

Python's Guido van Rossum 


#300 JUNE 1999 




























































Time to market.

Insanely productive.

Code right.

The world’s number one selling
programmer's editor.

SINGLE USER LICENSE
WINDOWS 98/95/NT.

$299

1-888-477-3642
www.premia.com

Driven to win.

Premia



















FEATURES 


DR. DOBB’S JOURNAL 1999 EXCELLENCE IN PROGRAMMING AWARDS 20 
by Jonathan Erickson 

Guido van Rossum and Donald Becker are the recipients of this year’s Excellence in 

Programming Awards for their commitment to technical innovation and open communication. 


USING THE COATS-MELLON OPERATIONAL SPECIFICATION 23 
by Mark Coats, Mark McCloskey, and Theo Molla 

The Coats-Mellon Operational Specification is a methodology for defining user-based scenarios 

that represent a complete and accurate model of system behavior. This article describes how the 
methodology has been used with real-world projects. 


JAVA PORTABILITY BY DESIGN 34 
by John J. Rofrano 

Factory classes ensure that application code remains unaware of the platform it’s running on. John 
describes how his team used factory classes when building an e-commerce catalog search engine 
written entirely in Java. 


CROSS-PLATFORM DESIGN STRATEGIES 42 
by Bob Krause 

Bob discusses the cross-platform architecture his company uses when building applications that run 

on multiple platforms. In doing so, he presents a set of thread classes developed for use on both 
Macintosh and Windows. 


A DNA SEQUENCE CLASS IN PERL 50 
by Lincoln Stein 

The Human Genome Project is a multinational project to determine the entire human DNA sequence 

by the year 2003. Lincoln describes some of Perl’s object-oriented features in his Sequence library, 
which manipulates DNA sequences. 





EXTENSIBILITY IN TCL 64 
by John Ousterhout 

One of the major reasons the Tcl scripting language has been widely adopted is its 

extensibility. Tcl’s creator describes the design decisions he made to ensure this quality. 


WIN32 DRIVERS FOR DIGITAL/VIDEO CAMCORDERS 74
by Thomas Tewell 

The only thing wrong with the emerging class of digital/video camcorders is the lack of 

software to use them. Consequently, Thomas wrote a complete IEEE 1394 class driver 

package for Windows 98 and his new Sony DCR-PC10 digital/video camcorder. 


EMBEDDED SYSTEMS 


ROTATING A WEATHER MAP 80 
by Robert D. Grappel 

Robert describes and implements an algorithm he developed to efficiently perform rotation of 
graphical weather maps used by airplane pilots. He then suggests techniques you can use to 
optimize other time-limited computer applications. 





DR. DOBB’S JOURNAL (ISSN 1044-789X) is published monthly by Miller Freeman, Inc., 600 Harrison Street, San Francisco, CA 94107; 415-905-2200. Periodicals Postage Paid at San Francisco
and at additional mailing offices. SUBSCRIPTION: $34.95 for 1 year; $69.90 for 2 years. International orders must be prepaid. Payment may be made via Mastercard, Visa, or American 
Express; or via U.S. funds drawn on a U.S. bank. Canada and Mexico: $45.00 per year. All other foreign: $70.00 per year. POSTMASTER: Send address changes to Dr. Dobb’s Journal, 

P.O. Box 56188, Boulder, CO 80328-6188. GST (Canada) #R124771239. Canada Post International Publications Mail Product (Canadian Distribution) Sales Agreement No. 0548677. 

FOREIGN NEWSSTAND DISTRIBUTOR: Worldwide Media Service Inc., 30 Montgomery St., Jersey City, NJ 07302; 212-332-7100. Entire contents © 1999 by Miller Freeman, Inc., unless 
otherwise noted on specific articles. All rights reserved. 


4 Dr. Dobb’s Journal, June 1999 








INTERNET PROGRAMMING 


CONCEPT-ORIENTED PROGRAMMING 90 
by Brian McConnell 


Concept-oriented programming makes it possible to write software that requires 
far less bandwidth to deliver, and thereby to increase apparent delivery speeds 
significantly. It also creates a mechanism for disseminating reusable code 
throughout the Internet. 


PROGRAMMER'S TOOLCHEST 


A VIDEO FOR WINDOWS ACTIVEX CONTROL 98 


by Ofer LaOr 

Video for Windows lets applications interact with video-capture cards. Ofer 
describes oVFW, an ActiveX control that encapsulates the Video for Windows 
API so that Visual Basic applications can easily interact with video-capture cards. 


COLUMNS 


PROGRAMMING PARADIGMS 109 
by Michael Swaine 


Did Xerox PARC blow it? Has HP lost its way? Can Linux really be for 
dummies? Michael asks and answers these and other questions. 


C PROGRAMMING 115 
by Al Stevens 

Al’s project this month is a Jukebox that maintains a list of standard MIDI 

Format files. 


JAVA Q&A 121 
by James Begole, Philip L. Isenhour, and Clifford A. Shaffer 

Can JavaBeans be shared? You bet, and our authors show you how. Their 

approach is based on a replicated architecture, where each collaborator 

maintains a copy of the shared data. 


ALGORITHM ALLEY 125 
by Bill McDaniel 

QuickSort is nice, but it’s usually implemented using statically allocated 

arrays, and it does not take advantage of already-sorted data. Bill’s 

variation of the Merge Sort addresses both of these weaknesses. 


DR. ECCO’S OMNIHEURIST CORNER 130 
by Dennis E. Shasha 

Dr. Ecco and Liane discover that there’s an art to putting together the 

pieces of a geometric puzzle. 


PROGRAMMER’S BOOKSHELF 133 
by Gregory V. Wilson 

The focus of Greg’s attention this month is The Practice of Programming, 

by Brian W. Kernighan and Rob Pike; How to Build a Beowulf, by 

Thomas L. Sterling, John Salmon, Donald J. Becker, and Daniel F. Savarese; 
Developing Visual Basic Add-ins, by Steven Roman; and Graph Drawing: 

Algorithms for the Visualization of Graphs, by Giuseppe Di Battista, Peter Eades,
Roberto Tamassia, and Ioannis G. Tollis. 










JUNE 1999 
VOLUME 24, ISSUE 6 





FORUM 


EDITORIAL 8 
by Jonathan Erickson 

LETTERS 12 
by you 

NEWS & VIEWS 18 
by the DDJ staff 

OF INTEREST 142 
by Eugene Eric Kim 

SWAINE’S FLAMES 144 
by Michael Swaine 


RESOURCE CENTER 


As a service to our readers, source code and 
related files, and author guidelines are available 
at http://www.ddj.com/. Source code is also 
available via anonymous FTP from ftp.ddj.com 
(199.125.85.76). Letters to the editor, article 
proposals/submissions, and inquiries can be sent 
to editors@ddj.com, faxed to 650-358-9749, or 
mailed to Dr. Dobb’s Journal, 411 Borel Ave., 
Suite 100, San Mateo, CA 94402-3522. 

For subscription questions, change of address, 
and orders, call 800-456-1215 (U.S. or Canada). 
For all other countries, call 303-678-8475 or fax 
303-661-1181. E-mail subscription questions to 
ddj@neodata.com or write to Dr. Dobb’s Journal, 
P.O. Box 56188, Boulder, CO 80322-6188. 

Back issues may be purchased for $9.00 per 
copy (includes shipping and handling). For issue 
availability, send e-mail to orders@mfi.com, fax 
to 785-841-2624, or call 800-444-4881 (U.S. and 
Canada) or 785-838-7500 (all other countries). 
Back issue orders must be prepaid. Please send 
payment to Dr. Dobb’s Journal, 1601 W. 23rd 
Street, Suite 200, Lawrence, KS 66046-2700. 

Individual back articles may be purchased 
electronically at http://www.ddj.com/ as ZIP 
archives. 


NEXT MONTH 


Windows CE and the PalmPilot, JINI 
technology, Internet telephony and security: 
We'll look at the latest in communication 
and networking in July. 








Real-world data management solutions
are typically more complex, when one
examines the pieces, than initially
recognized by the majority of database
programmers. All software projects are
complex puzzles comprised of many
details, most of which are data-related.
Often today’s “DBMS” solutions sacrifice
the speed or control essential for a
competitive application.


c-tree Plus®, by FairCom, has been the
choice of commercial developers for twenty
years precisely because it offers the
flexibility and control at the detail level to fit
a wide variety of data management needs.
Proven on large Unix servers and
workstations, c-tree Plus’s small footprint
and exceptional performance have also made
it the engine of choice for professional
developers on Windows and Mac. c-tree Plus
offers sophisticated ISAM level control with
which the developer may define precise data
management solutions, making it a perfect
fit for any development project requiring
specific data handling features.

c-tree Plus® offers the most
mature ISAM solution today.


FairCom’s
c-tree Plus
database
engine:

• Advanced Indexing Technology
• Complete Source Code
• Complete Transaction Processing
• ODBC Interface
• Over 25 Developer’s Servers Included
• Portable Multi-threaded API
• Royalty Free
• Standalone, Multi-user or Client/Server Models
• Thread-Safe Libraries
• Y2K Compliant





Don't wait, see for yourself!
USA. 800.234.8180

The FairCom
Server:

A solid, high performance
database server that is
scalable, portable and offers
unequalled control. FairCom
has been providing database
solutions to the commercial
development community for
twenty years. You won’t find
a better solution, with these
features and performance
anywhere else!

• Client Side Source Code
• File Encryption
• File Mirroring Logic
• Full Conditional Index Support
• Full Heterogeneous Networking
• Multiple Communication Protocols
• Online Data Backup
• Small Memory Footprint
• Flexible OEM Licensing Options
• Source Code Availability

All these
platforms
supported in one
package:

MIPS ABI, DEC Alpha, Sun
SPARC, Windows 9X, SCO,
88OPEN, AIX, RS/6000,
HP9000, Sun OS,
Interactive Unix, Linux
(Alpha...), AT&T System V,
QNX, Free BSD, OS/2,
Mac, Windows NT,
Windows 3.1, DOS,
Netware NLM, & Banyan
VINES.

FairCom
Database Solutions Since 1979

Phone: USA 573.445.6833 - EUROPE +39.035.773.464 - BRAZIL +55.11.38872.8802
Other company, product and operating platform names are registered trademarks or trademarks of their respective owners.












Dr. Dobb’s Journal
SOFTWARE TOOLS FOR THE PROFESSIONAL PROGRAMMER


PUBLISHER 
Timothy Trickett 


EDITORIAL 

EDITOR-IN-CHIEF 

Jonathan Erickson 

MANAGING EDITOR 

Deirdre Blake 

MANAGING EDITOR, DIGITAL MEDIA 
Kevin Carlson 

SENIOR TECHNICAL EDITOR 

Tim Kientzle 

TECHNICAL EDITOR 

Eugene Eric Kim 

SENIOR PRODUCTION EDITOR 
Monica E. Berg 

EDITORIAL ASSISTANT 

Amy Lincicum 

ART DIRECTOR 

Margaret A. Anderson 

INTERNET BROADCAST PRODUCER 
Philippe Lourier 

CONTRIBUTING EDITORS 

Al Stevens, Tom Genereaux, David Betz, Bruce Schneier,
Mark Russinovich, Bryce Cogswell, Ray Duncan,
Jack Woehr, Jon Bentley, Dennis Shasha
EDITOR-AT-LARGE 

Michael Swaine 

PRODUCTION MANAGER 

Denise Denis 


CIRCULATION 

DIRECTOR OF CIRCULATION 
Jerry Okabe 

GROUP CIRCULATION MANAGER 
Michael Poplardo 


MARKETING/ADVERTISING 


SALES DIRECTOR, EAST 

Brenner Fuller 

SALES DIRECTOR, WEST 

Paul Miller 

MARKETING DIRECTOR 

Holly Vessichelli 

DIGITAL MEDIA MANAGER 

Michael Calderon 

MARKETING ASSISTANT 

Marquita Tinio 

ACCOUNT MANAGERS see page 136 

Stan Barnes, David Katch, Gabriel Rogol,
Michael Beasley, Tom Hudner, Brenner Fuller,
Michael Kelleher, Linda Guyette, Keith Johnson,
Elizabeth Doherty, Stacey Mochizuki
GRAPHIC DESIGNER 

Carey Perez 




DR. DOBB’S JOURNAL 
411 Borel Avenue, San Mateo, CA 94402-3522 
650-358-9500. http://www.ddj.com/ 


MILLER FREEMAN INC. 


CEO/MILLER FREEMAN GLOBAL 

Tony Tillin 

CHAIRMAN/MILLER FREEMAN INC. 
Marshall W. Freeman 

PRESIDENT 

Donald A. Pazour 

EXECUTIVE VICE PRESIDENTS 

H. Ted Bahr, Darrell Denny, Galen A. Poss,
Regina Starr Ridley 

SENIOR VICE PRESIDENTS 

Wini D. Ragus, Peter Hutchinson 

VICE PRESIDENT/PRODUCTION 
Andrew A. Mickus 

VICE PRESIDENT/CIRCULATION 

Jerry Okabe 

VICE PRESIDENT/GROUP PUBLISHER 
Peter Westerman 


Estimated print run 170,000. 


Miller Freeman

A United News & Media company

American Business Press
Printed in the USA

BPA




C++ components from Rogue Wave: the choice of champions! 





Save the day!

It can be a stretch to create portable, scalable apps and deliver them on time. And
building and testing your own low-level C++ components can really put your game at risk.
Why not let Rogue Wave help you cover all the bases? With the integrated and reusable classes
in Tools.h++ Professional, Threads.h++, and Standard C++ Library, your development team can
get a head start on building a solid, high-performance foundation for every application. Know
the score: use C++ components from Rogue Wave, and your whole team will be batting 1000!

Get the stats. Download your free demos and white papers now at:
www.roguewave.com/ad/catch

• Fundamental classes
• Networking classes
• Multithreading classes
• Solutions for Java/C++ interoperability

Rogue Wave
SOFTWARE
Components Without Limits

Rogue Wave: 800-487-3217 • D.A.CH.: +49-6103-59 34-0 • UK: +44-118-988 0224
France: +33-1-4196 2626 • Italy: +39-02-3809 3288 • Rest of Europe: +31-20-301 26 26


Rogue Wave and .h++ are registered trademarks of Rogue Wave Software, Inc. All other trademarks are the property of their respective holders. 





EDITORIAL 





What? 
Me Worry? 





Photo courtesy of Jon Blumb. 


The good news, according to the FCC, is that television
viewing won't be interrupted by the Y2K problem. The
bad news, according to the rest of us, is that television
viewing won't be interrupted by the Y2K problem.

Yep, as we wade through the waning months of 1999,
Y2K hysteria is mounting, spurred on by every con man,
crackpot, and huckster who, up to now, has been peddling
Ginsu knives, beachfront property, and Internet stocks.
Don’t believe me? Take a look at http://www.y2-kash.com/,
which squawks, “Don’t say Doomsday, say Payday!!” or the
Y2K: Disaster 2000 Millennium Survival Guide, a pulp rag
that can’t decide whether it’s an in-your-face Mother Earth News or kinder, gentler Soldier of Fortune.
With articles ranging from “Countdown to Chaos!” and “Y2K Crime Alert!” to “The Gun for all
Reasons” and “Wood Stoves: Reliable Heat Y2K Can't Stop!” (Phew! All those exclamation marks wear
me out!), and ads hawking everything from gun accessories to a year’s supply of Texas chili (thanks,
but I'll stick to starvation), the magazine speaks to fear rather than fact, chaos rather than cooperation.

It’s drivel such as this, in fact, that’s more worrisome than Y2K disruptions. University of Texas
historian Howard Miller, for instance, fears that sensational media disinformation will intensify
over the coming months and panic will result. The International Association of Emergency
Managers concurs, recently telling the U.S. House of Representatives that panic could outweigh
the impact of any breakdowns in basic services.

Of course, my local bank isn’t inspiring much confidence either. In a recent letter, the bank bragged
that “We will be prepared to conduct business as usual on January 1, 2000.” Gee, considering that
January 1 is a holiday with no business conducted...well, you understand my concern.

So, assuming the worst at the end of the year, where can we go to escape Y2K chaos? My money’s
on Protection, Kansas, “America’s first Y2K protected community.” Granted, protecting Protection,
population 560, isn’t the biggest job in the world. But say what you will, Protection’s 200-plus
computers are Y2K safe, and the town sure knows how to throw a party. Not that being protected is
new to Protection. In 1957, the town became the first in America to be protected from polio. Amidst
the hubbub of parades, bands, rodeo performers, and stunt cars, residents received Salk polio vaccine
inoculations. Picking up where Jonas Salk and the Foundation for
Infantile Paralysis left off, @Backup (http://www.backup.com/),
an online data management company, launched “Project
Protection,” providing free online backup of commercial,
municipal, and personal data for an entire year. In honor of the
event, we packed up and headed cross country for peace of
mind, free barbecue, and a chance to see environmental artist
Stan Herd’s (http://www.stanherd.com/) latest crop art: a
650-foot-long “view from above” work of art growing in the middle
of a Kansas wheat field. And we weren't disappointed. The
barbecue was great, and Herd’s rendition of the @Backup logo is,
as you can see, amazing.

There was, of course, the obligatory message from the sponsor, @Backup in this case. Unlike
most of what you hear about Y2K these days, @Backup vice president Melinda King didn’t try to
scare the local population into hiding in storm cellars come December 31. Much of what she had
to say, as you might expect, involved the virtues of backing up data.

If you stop and think about it, backing up is perhaps the easiest and most straightforward
protection for potential Y2K problems. (Even my bank has figured this out, saying in the
aforementioned letter that “backup records could be used to identify and correct errors...due to a
year 2000 computer problem.”) @Backup’s shtick is “online” backup, whereby changes on your
hard disk (new files or modified portions of existing files) are automatically uploaded (direct
modem or via the Internet), encrypted, and stored at the San Diego Supercomputer Center.
@Backup isn’t alone in this market. Others in the online backup biz include Intel with its
AnswerExpress service (http://www.answerexpress.com/) and the Triangle Research Group’s
Saf-T-Net (http://www.trgcomm.com/). All three are similar in price (about $30/month) and provide
services such as CD-ROM archives. Feature comparisons aside, what there is no question about is
that @Backup and Protection put on the best parties. You gotta love that barbecue.

Jonathan Erickson
editor-in-chief
Real-Time Redux


Dear DDJ, 

Norman Dotti makes excellent points in his
“Real Real-Time” letter (DDJ, February 1999).
Regarding real-time instruments such as FFT
spectrum analyzers, he distinguishes between
those that can acquire and display data
continuously with no gaps, versus those that
only give the visual appearance of continuous
data. Because our Data AcQuisition And
Real-Time Analysis (DAQARTA) shareware allows
both approaches, I'd like to comment on the
virtues of each.

In a typical real-time analyzer, when the
sample rate is adjusted to the point where
processing just keeps up with acquisition,
all of the input data is reflected in the
display. If that sample rate is, for example,
20 kHz when processing 1024-sample FFTs
(typical for Daqarta running on an old 386
system), then the display update rate is
20000/1024, or 19.5 screens per second.
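
The arithmetic in the paragraph above can be checked with a few lines of Python. This is only an illustrative sketch, not Daqarta code: the function name is made up, and it assumes each screen consumes exactly one FFT block with no overlap.

```python
def display_update_rate(sample_rate_hz: float, fft_size: int) -> float:
    """Screens per second when every screen uses one full FFT block.

    Assumes gap-free acquisition and no overlap between blocks.
    """
    if fft_size <= 0:
        raise ValueError("fft_size must be positive")
    return sample_rate_hz / fft_size


# The letter's example: 20 kHz sampling with 1024-sample FFTs.
rate = display_update_rate(20_000, 1024)
print(f"{rate:.1f} screens per second")
```

Doubling the FFT size at the same sample rate halves the update rate, which is why the gap-free limit depends on both numbers.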

At higher sample rates the display shows 
only the most recent data, but still updates 
at the same rate. In theory, a signal could 
contain time-domain transients that fell 
“between the cracks” of displayed spectra 
and were thus overlooked. With a faster 
system, these might turn up as occasion- 
al jumps in the noise floor of the spec- 
trum...assuming you could spot an infre- 
quent flicker at such high display rates. 

Although it’s pretty unlikely that such 
transients would be synchronized with the 
data gaps, they could be detected by re- 
ducing the sample rate to eliminate the 
gaps. But if you know there are transients, 
then a better approach is to trigger on 
them: You can then observe the transient 
alone, or the “clean” data before or after 
the transient using a deep data buffer. 
There is no problem with missed data at 
high sample rates, because the analyzer 
waits for the trigger before processing only 
the desired synchronous data. 

In fact, triggering is useful even with
repetitive signals, to give a more stable
spectrum. And for viewing waveforms instead
of spectra, triggering is practically
mandatory for a stable display. Even more
important, proper triggering allows
time-domain (waveform) averaging for
impressive noise reduction of evoked
responses or other repeating signals. So if
you want a fast sample rate, there is really
no need to forgo it or to buy a faster
system just to stay within the real-time
limits of your analyzer.
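
To see why triggered waveform averaging reduces noise, here is a small Python sketch. It is an illustration under stated assumptions, not the analyzer's implementation: the sine "response", the Gaussian noise, the sweep count, and all names are invented for the example, and every sweep is assumed to be perfectly aligned by the trigger.

```python
import math
import random


def averaged_sweep(sweeps):
    """Point-by-point mean of equal-length, trigger-aligned sweeps."""
    count = len(sweeps)
    return [sum(column) / count for column in zip(*sweeps)]


def rms(values):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))


random.seed(1)  # fixed seed so the run is repeatable
length, count = 256, 64

# Hypothetical repeating response: four cycles of a sine per sweep.
signal = [math.sin(2 * math.pi * 4 * i / length) for i in range(length)]

# Each sweep is the same response plus independent Gaussian noise.
sweeps = [[s + random.gauss(0.0, 1.0) for s in signal] for _ in range(count)]

avg = averaged_sweep(sweeps)
noise_single = rms([x - s for x, s in zip(sweeps[0], signal)])
noise_avg = rms([a - s for a, s in zip(avg, signal)])
print(f"noise reduced by a factor of about {noise_single / noise_avg:.1f}")
```

For uncorrelated noise the residual shrinks roughly as the square root of the number of sweeps, so 64 sweeps should give something near a factor of 8. Without a trigger the sweeps would not line up and the signal itself would average away.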

This discussion presumes that the data- 
acquisition board or sound card can use 
DMA or FIFO interrupts to acquire data 
in the background while processing in the 
foreground. But many popular laborato- 
ry boards lack these facilities, and rely 
upon interrupts to acquire each sample. 
At high sample rates the interrupt over- 
head slows processing so much that a se- 
quential mode of operation is better: In- 
stead of interrupts, foreground polling is 
used until enough data is acquired, then 
that data is processed before going back 
for more. As the sample rate goes up, the 
time to acquire the data goes down. Pro- 
cessing time is unchanged, so the display 
update rate actually rises. 
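
The sequential-mode behavior described above can be put into a rough timing model. The Python sketch below is an assumption-laden illustration, not Daqarta code: it supposes each display update costs the time to acquire one block plus a fixed processing time, and the 50 ms processing figure is a made-up value.

```python
def sequential_update_rate(sample_rate_hz, block_size, processing_s):
    """Updates per second when acquisition and processing alternate."""
    acquire_s = block_size / sample_rate_hz  # shrinks as the sample rate rises
    return 1.0 / (acquire_s + processing_s)


# Hypothetical numbers: 1024-sample blocks, 50 ms of processing per block.
for sample_rate in (10_000, 20_000, 40_000):
    rate = sequential_update_rate(sample_rate, 1024, 0.050)
    print(f"{sample_rate} Hz -> {rate:.1f} updates per second")
```

Because the processing term is constant, the update rate climbs with the sample rate but can never exceed one update per processing interval.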

Sequential mode allows typically twice
the throughput of real-time mode for these
boards, particularly for evoked response
applications using simultaneous stimulus
generation. As an example, a basic
DAS08/Jr-AO board (under $200 from Computer
Boards or Cyber Research) in a 386DX-40
system can run at its maximum ADC rate
of nearly 40 kHz while simultaneously
outputting different tone burst complexes from
both DACs at 120 kHz each, for a fully
synchronous aggregate of almost 280 kHz.

And even for boards that don’t need it 
for speed, sequential mode makes it easy 
to poll a TTL input to act as an external 
trigger, or to produce a TTL trigger pulse 
to synchronize external equipment. There 
are many real-world applications where 
such features and performance are much 
more important than the opportunity to 
view every last sample in untriggered true 
real-time mode. 

Of course, there is a trick to getting this 
kind of power and performance (in addi- 
tion to 100 percent assembly language with 
a custom just-in-time optimizer that was fly- 
ing when Java was only a brew): Daqarta 
drops down to real-mode DOS. As Norman 
points out, Windows is not a real-time op- 
erating system, and even a real-time oper- 
ating system may be hopelessly inadequate 
for high-speed operation without special 
hardware. That same DAS08/Jr-AO board 
under Windows would be limited to a sam- 
ple rate of only a few kHz, with no simul- 
taneous outputs, even on a fast system. 

What about NT? With latencies running
to the hundreds of milliseconds, NT is an
even worse choice. “But NT is supposed to
be more stable,” you say. Yes, indeed...but
“stable” in the context of one application
not crashing another. For a single
application, real-mode DOS is by far the most
stable...it's a “Don’t just do
something...stand there!” kind of system, as
compared to the “But I was only trying to
help!” approach of Windows and NT.

And as Norman notes, loss of data is a
real issue; you just never hear it mentioned
by vendors of multitasking data-acquisition
systems. Exactly what applications can run
concurrently without risking your data?
How will you know when you’ve exceeded
the limit? Consider that nearly every day
someone complains to the sound card tech
newsgroup about “stuttering sound when
I move the mouse,” and ask yourself how
to detect a similar corruption in
physiological signals or machine vibrations, where
people don’t ordinarily listen to the data.

So Norman’s wise advice bears repeat- 
ing: Make sure you know what the other 
guy means by “real-time”! 

Bob Masta 

tech@daqarta.com 


Online Op/Eds 

Dear DDJ, 

If the Online Op/Ed “Windows: Linux’s 
Secret Weapon,” by Lou Grinzo, had been 
written a year ago I would have asked for 
the crystal ball and complimented him on 
being a visionary. As is, it just looks like 
he has been paying attention. 

Right now, Corel has 16 programmers
on staff working to complete WineLib (part
of the Wine Project). Simply put, this is a
stab at making Windows programs run on
UNIX/Linux. The Lib portion is to ease the
porting of Windows apps to native Linux.
Of course, Corel is helping because they
have more legacy Win32 code than any
other software company (except Microsoft).

Item two. Drop by http://www.kde.org 
and read the archives in the kde-look@ and 
kde-devel@ lists. That looks like these peo- 
ple spend an awful lot of time working on 
usability and trying different approaches to 
making the interface ergonomic. 

Kevin Forge 

forgeltd@usa.net 


Dear DDJ, 

After reading Tim Pfeiffer’s Online Op/Ed 
“Windows DLLs: Threat or Menace?,” I felt 
compelled to respond. Pfeiffer’s simple 
solution “don’t use them” is actually not 
so simple. Not using any DLL is hardly 
possible: You would need to find static 
linkable equivalents for GDI, USER, KER- 
NEL, and other core components of Micro- 
soft Windows. Not using any DLL, except 
for system DLLs does not solve the prob- 
lem: Many versioning problems that ap- 
plications have are in fact due to updates 
of system DLLs (COMCTL32.DLL, for ex- 
ample) through the installation of “office 
suites” or web browsers. 
Jtest is the world's first automatic white-box module testing tool for Java. With a single click, Jtest finds
your undocumented, uncaught runtime exceptions. Unlike most Java products, Jtest also checks your
internal code structure for weaknesses and monitors code coverage to ensure complete testing.

Serious programmers and corporations are using Jtest on all of their Java code. Join them, and find out
how easy it is to produce code that is clear, concise, and easy to maintain!

Jtest is so easy to use, you can start finding bugs
automatically at the click of a button.

When Jtest finds an
exception, it shows you
the exact code responsible
for the exception, making
bug-fixing fast and painless.

Suppressions Table

With Jtest, it's easy to focus on the
exceptions that are most important to you.
You can customize reports to suppress
exceptions by type, name,
and enclosing method.

Luke Cassidy-Dorion,
Java Pro, February 1999

The use of DLLs instead of static libraries
may indeed be a potential cause of problems
for which there is no easy fix. My
own approach is to include version checks
in each application, and to warn about
conflicts. This at least keeps the end user
informed. As for storing the locations of
DLLs, I much prefer the use of local
configuration files to cluttering the central
registry (or the WIN.INI, for that matter).
Installing a DLL in the Windows “system”
directory is usually not a good idea.
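
A version check of the kind described above might look roughly like the following Python sketch. It is not the letter writer's code: the function names, the dotted version strings, and the sample version numbers are all hypothetical. On Windows the installed versions would have to be read from each DLL's version resource (for example, with the Win32 GetFileVersionInfo call); here they are supplied as a plain dictionary to keep the example self-contained.

```python
def parse_version(text):
    """Turn a dotted version string such as '4.72' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def check_versions(required, installed):
    """Return one warning string per missing or too-old library."""
    warnings = []
    for name, minimum in required.items():
        found = installed.get(name)
        if found is None:
            warnings.append(f"{name}: not found (need {minimum} or later)")
        elif parse_version(found) < parse_version(minimum):
            warnings.append(f"{name}: found {found}, need {minimum} or later")
    return warnings


# Hypothetical requirement and installed state, for illustration only.
required = {"COMCTL32.DLL": "4.72"}
installed = {"COMCTL32.DLL": "4.70"}

for message in check_versions(required, installed):
    print("Warning:", message)
```

Comparing tuples of integers rather than raw strings keeps "4.10" newer than "4.9", which a plain string comparison would get wrong.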

Thiadmer Riemersma 

thiadmer@compuphase.com 


Dear DDJ,

I enjoyed the Online Op/Ed entitled “Win- 
dows: Linux’s Secret Weapon,” by Lou 
Grinzo at http://www.ddj.com. I would 
like to make one comment, however, re- 
garding the “insular” mindset of Linux de- 
velopment to date. 

The OS, as you probably know, is not
only undergoing rampant development
but also a redefinition of its clientele. While
previous versions may well have been suited
only to highly technical programmers
with a UNIX background, that is changing.
In the past year, usability has been
drastically improved through work on the
window manager KDE and the desktop
system GNOME, which have the ability to
make many programs appear almost identical
to those running on Windows (or the
Mac). This alone is a great thing; I’m so
used to Windows, sometimes I have a hard
time “thinking outside the box.” But with
the ever-present discussion on Linux’s role
in the desktop market, there has been debate
about what to do with legacy utilities
that relied on a command-line interface
and textual config files. Joe Blow
doesn’t want to type everything, but Mr.
Hacker doesn’t want his hands tied by any
given user interface, which tends to rigidly
structure, if not limit, what you can do
with those utilities. The command-line
interface, after all, is extremely flexible.
The most common solution I’ve seen,
and I think it’s a good one, is to develop
a GUI that controls the original
command-line version. This is happening for all
manner of applications, from package
installation to desktop configuration, Apache
(WWW server) configuration, you name
it. In this way, users who resist or dislike
the CLI can see the util’s pretty side, while
the grunts can still get the good old CLI
they crave. It works to everyone’s benefit.
In fact, I think it would be shameful for
more programmers, Windows or no, to
rewrite solid proven apps in order to
restrict those apps to a GUI interface. Linux
seems to be about openness and meeting
everyone’s needs. A commendable goal.

Regarding the state of autofs and the
mount/umount situation, I had a Mac
friend who constantly complained that
Windows couldn’t tell if there was a disk
in the drive or not, in that no icon showed
up when you popped one in. There’s no
visual clue what’s going on. And both
Windows (DOS) and Mac sometimes
attempt to access a disk that isn’t there and
complain about it. I’m not sure if there is
a universal solution to this problem...usability
means different things to different
people. My Mac friend thought the icon
thing was stupid. I think it’s stupid to drag
some icon to the trash just to get your disk
out. I like the push button floppy drive. I
don't really care for mount/umount, but
I can’t really think of a better solution.

I should point out that there is strong 
development in the areas of plug-and-play 
recognition and power management in the 
2.2 kernel, and this is ongoing. Perhaps this 
year more people will regard Linux as a 
strong desktop contender. I already do. 

Michael Coddington 

madrid@bway.net 
HASP® Helps You Break
Sales Records


The Winning Advantage Is that HASP actually creates new sales 
opportunities while providing you with unparalleled security. 





HASP’s Remote Update System makes it possible for you to 
remotely modify the execution of software modules. For 
example, you can easily transform a demo into a fully 

licensed version or you can grant extra licenses to a 
network application. 


HASP enables all kinds of exciting new sales strate- 
gies: module sales, in-the-field upgrades, try-before- 
you-buy, enhanced licensing capabilities plus run- 
or time-limited product use demos. Since HASP 
produces 100% user registration, you can sell 
more upgrades and add-on products too. 


HASP protects software. And HASP helps
sell it. Is it any wonder twenty-five
thousand winning software developers
depend on HASP?


Call and order your Developer's 
Kit today! 


www.aks.com/dobbs

Aladdin Knowledge Systems Ltd.

North America: Aladdin Knowledge Systems Inc.
New York Tel: +1-800 223-4277, +1-212 564-5678, Email: hasp.sales@us.aks.com
Chicago Tel: +1-800 562-2543, +1-847 808-0300, Email: hardlock.sales@hardlock.com
International: +972-3-636-2222, Email: hasp.sales@aks.com
HASP is a registered trademark of Aladdin Knowledge Systems, Ltd.








Jonathan Erickson 


Dr. Dobb’s Excellence in Programming Awards are presented 
annually to individuals who, in the spirit of innovation and 
cooperation, have made significant contributions to the 
advancement of software development. Recipients of 
previous Dr. Dobb’s Excellence in Programming 
Awards include: 


• Alexander Stepanov, developer of the C++ Standard 
Template Library. 

• Linus Torvalds, the force behind the Linux operating 
system. 

• Larry Wall, the author of the Perl programming 
language. 

• James Gosling, chief architect of Java. 

• Ronald Rivest, an educator, author, and cryptographer. 

• Gary Kildall, a computer pioneer in the areas of 
operating systems, programming languages, and 
user interfaces. 

• Erich Gamma, Richard Helm, John Vlissides, and 
Ralph Johnson, authors of the seminal Design 
Patterns: Elements of Reusable Object-Oriented 
Software. 


The recipients of the 1999 Dr. Dobb’s Excellence 
in Programming Awards are no less outstanding 
when it comes to technical innovation and sup- 
port of open communication in the programming 
community. As creator of the Python programming 
language, Guido van Rossum has given software developers a 
tool that addresses many of the shortcomings of more well- 
known and mainstream languages. On the systems side, Donald 
Becker has contributed extensively to Linux’s networking 
code and played a pivotal role in advancing low-cost, high- 
performance parallel computing as the chief investigator of the 
Beowulf Project. 

Python, an interpreted, interactive, object-oriented program- 
ming language, has its roots in a language called “ABC.” (For 
the curious, the moniker “Python” derives from “Monty Python.”) 


Jonathan is editor-in-chief of DDJ and can be contacted at 
jerickson@ddj.com. 

ABC, which Van Rossum helped develop in the 1980s, was originally 
created to teach novices how to program and as an effective 
tool for occasional programmers. Although it was freely 
available and elegant, ABC never caught on, in part, Van Rossum 
speculates, because of the difficulty in adding new primitive op- 
erations. Consequently, when Van Rossum decided to build an 
interpreter for a new scripting language in 1989, his first design 
decision was to avoid this kind of mistake. 

Still, Python inherits many of ABC’s features that make it an ap- 
proachable language for programmers of all levels. In short, Python’s 
major features include its support for object-oriented development 
and powerful programming constructs, extendible and embed- 
dable architecture, and clear syntax. Python makes it extremely 
easy to build complex data structures out of objects, lists, dictionaries, 
and the like. It is particularly useful for system administration, 
building GUIs, scripting, database programming, and rapid 
prototyping. Python is portable, running on UNIX, Windows, Mac- 
intosh, Amiga, BeOS, and other systems. And it is freely available. 
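
As a small illustration of that claim, the sketch below nests objects inside a list inside a dictionary. The `Recipient` class, the `awards` layout, and the `summarize` helper are hypothetical names chosen for this example; only the people and their work come from the text above.

```python
class Recipient:
    """Hypothetical record type; instances can live inside lists and dictionaries."""

    def __init__(self, name, known_for):
        self.name = name
        self.known_for = known_for  # a plain list of strings


# A dictionary keyed by year, holding a list of objects (layout is assumed).
awards = {
    1999: [
        Recipient("Guido van Rossum", ["Python"]),
        Recipient("Donald Becker", ["Linux networking code", "Beowulf Project"]),
    ],
}


def summarize(awards_by_year):
    """Flatten the nested structure into one readable line per recipient."""
    lines = []
    for year in sorted(awards_by_year):
        for recipient in awards_by_year[year]:
            lines.append(
                "%d: %s (%s)" % (year, recipient.name, ", ".join(recipient.known_for))
            )
    return lines
```

No declarations or memory management are needed to build or walk the structure, which is the property the paragraph is describing.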

Van Rossum started developing Python while working at CWI, 
the National Research Institute for Mathematics and Computer 
Science in the Netherlands. While there, he worked on the Amoe- 
ba project, a distributed operating system that was the brain- 
child of Andrew Tanenbaum and jointly developed by CWI and 
the Computer Systems Group of the Department of Computer 
Science of the Free University of Amsterdam. It was at this time 
that Van Rossum started Python development. 

Van Rossum is currently a group leader and system architect 
for the Corporation for National Research Initiatives (CNRI), a 
nonprofit organization in the U.S. that undertakes research and 
development for the National Information Infrastructure. At CNRI, 
he is working on a system for mobile agents called “Knowbots” 
(http://www.cnri.reston.va.us/home/koe/) that uses Python as 
its main programming language. CNRI currently supports Python 
development, including coordinating the Python Software As- 
sociation (http://www.psa.org/) and Python Consortium 
(http://www.python.org/). Van Rossum is also the coauthor (with 
Aaron Watters and Jim Ahlstrom) of Internet Programming With 
Python (IDG Books, 1996). Much of his recent interest involves 
JPython, a complete Python implementation written in 100 percent 
pure Java which compiles Python source code directly to 
Java bytecode. The resulting class files can be run in any browser 
that is JDK 1.1 compliant. 

One of the challenges in the realm of scientific computing is to 
efficiently and affordably handle large data sets. This is precisely 
the problem faced by researchers participating in the Earth and 
Space Sciences Project at the Goddard Space Flight Center. To tack- 
le the problem, Donald Becker and Thomas Sterling launched the 
Beowulf Project (http://beowulf.gsfc.nasa.gov/beowulf.htm), a 
cluster computer consisting of high-performance PCs built from 
off-the-shelf components, connected via Ethernet, and running 
under Linux. Ultimately, the goal of the Beowulf approach was to 
achieve supercomputer (gigaflop) performance at PC prices. 

To implement such a system, however, Becker, who is a staff 
scientist with the Center of Excellence in Space Data and Infor- 
mation Sciences (or CESDIS, part of the University Space Research 
Association, a nonprofit consortium of universities that sponsors 
space-related research), had to come to grips with Linux’s un- 
stable networking capabilities, and the lack of Linux support for 
off-the-shelf network cards. Consequently, Becker ended up writ- 
ing enhancements to the kernel network subsystem to support 
faster I/O on high-speed networks, device drivers for countless 
Ethernet cards (see http://cesdis.gsfc.nasa.gov/linux/drivers/ 
index.html), and a distributed shared memory package. 

Becker wasn’t a stranger to Linux, networking, or parallel 
computing when he launched into Beowulf, however. After receiving 
a degree in electrical engineering and computer science 
from the Massachusetts Institute of Technology, he worked for 
Harris Corp. as an engineer performing parallel computing 
research. From there he moved to the Institute 
for Defense Analyses’ Supercomputer Research 
Center, where he first encountered Linux and its lack 
of network support. Then in 1994, Becker joined 
CESDIS, where he began his Beowulf work. 

Although much of his initial work was in support 
of Beowulf, the entire computing community ulti- 
mately benefited from Becker's efforts. Linux would 
not have achieved the level of success and accep- 
tance it has today had it not been for Becker’s work, 
which resulted in a Linux with robust, stable net- 
working and support for “every shipping Fast Ether- 
net chipset.” As for Beowulf, dozens of university and 
research groups have now built their own Beowulf 
clusters, ranging from the original 16-node cluster 
running on Intel DX4 processors connected by 
channel-bonded 10-Mbits/sec Ethernet, to Avalon, a 
19-gigaflop cluster of 140 Alpha processors that was 
built by the Los Alamos National Laboratory and that 
cost only $150,000. 

Along with other members of his team at the Center of 
Excellence in Space Data and Information Sciences, 
Becker was the recipient of the IEEE Computer 
Society 1997 Gordon Bell Prize for Price/Performance 
“in recognition of their superior effort in 
practical parallel-processing research.” Becker is 
the coauthor, along with Thomas L. Sterling, John 
Salmon, and Daniel F. Savarese, of the recently 
published How to Build a Beowulf: A Guide to the 
Implementation and Application of PC Clusters 
(MIT Press, 1999). 

Please join us in honoring Guido van Rossum and 
Donald Becker. Once again, they remind us that a 
mix of technology, innovation, vision, and cooperative 
spirit continues to be fundamental to advancement 
in software development. 


DDJ 





Using the Coats-Mellon Operational Specification 

...design theory to a real-world test 


Mark Coats, Mark McCloskey, and Theo Molla 


The article “Constructing Operational Specifications,” by 
Mark Coats and Terry Mellon (DDJ, June 1995), introduced 
the Coats-Mellon Operational Specification (CMOS), a 
methodology for defining user-based scenarios that represent 
a complete and accurate model of system behavior. For 
the most part, the article focused on the methodology’s constructs 
and usage, using an automated-teller machine example 
to show how the method works, producing diagrams that 
could be transitioned to an object-oriented and/or structured-analysis 
model. In this article, we'll describe how CMOS has 
been used since then on real-world projects. 

The Mayer Receiver Project, for instance, is a hardware/software 
development project at the Motorola Space Systems and Services 
Division (SSSD). The purpose of the project is to provide secure 
communications. The Mayer receiver is a specialized processor of 
message packets in a communications system. This small project 
currently includes 21,000 lines of integrated C++ and Java code 
and is being updated to add another 10,000 lines. The software 
controls the hardware to allow data input, then processes that data 
for dissemination. The software is controlled by a Graphical User 
Interface (GUI). The CMOS method was used to synthesize, analyze, 
and validate the software requirements for the receiver. 





The Operational Specification 

Before discussing the impacts of the method during the devel- 
opment of the Mayer receiver, we'll provide a quick overview 
of CMOS. Refer to the original article for more details. 

The operational specification is a set of diagrams that speci- 
fy incoming stimuli via actor events, and a system’s response to 
these stimuli. An actor event is an occurrence initiated by an ac- 
tor at some point in time. The operational specification consists 
of diagrams that divide behavior into a set of actor events and 
system responses to those events. The diagrams produced by 
the operational specification are pure analysis models. They ad- 
dress analysis-phase, system-level behavior only and do not 
specify data, design, or implementation information. 


The authors are software engineers for Motorola. Mark Coats can 
be contacted at p26728@email.mot.com. Terry Mellon, CMOS 
coauthor, can be contacted at mellon@seex.com. 

The CMOS Diagrams 
The CMOS method produces three diagrams: 


• The actor diagram. 
• The actor-event diagram. 
• The event-response diagram. 


The actor diagram (Figure 1) shows actors that initiate and/or 
receive events. Actors can be human or nonhuman entities. Each 
actor event is plotted on the actor-event diagram. The actor-event 
diagram (Figure 2) shows how actors’ events are related 
to each other in time. It is similar to a UML sequence diagram 
but with more detail. System scenarios can easily be extracted from 
this diagram. Scenarios extracted from this diagram are considered 
system-level scenarios because each event is caused by a 
system-level actor action. Each actor event may have a set of system 
responses (we call them “response bubbles”) defined by 
event-response diagrams (Figure 3). Scenarios extracted from this 
diagram are normally considered software-level scenarios because 
the responses to actor events are usually implemented with software. 
These three diagrams work together to produce a complete 
specification of the scenarios that define a system’s behavior, 
both at the system and lower levels. 

Validating behavior represented in an operational specifi- 
cation involves tracing scenarios through the diagrams start- 
ing with an actor event and continuing through the event's 
corresponding system responses. The ease with which this 
validation process can be performed lets systems engineers, 
software engineers, domain experts, and customers help val- 
idate system requirements early in the development process. 
The scenarios are also excellent tools for the development of 
system-integration test cases that can be inserted directly into 
a test procedure document or plan. 


The Mayer CMOS Process 

The Mayer development team was initially given a sketchy set 
of system requirements for the project. From that, the team spent 
about eight weeks analyzing the system and its operation to 
produce CMOS diagrams. A textual requirements document was 
constructed while creating the diagrams to convey requirements 
that were not behavioral in nature. A GUI drawing was also 
constructed to help represent graphical components mentioned 
by textual descriptions in the diagrams. (This was not required 
by the CMOS method but is recommended for software that uses 
a large GUI.) After completion, the CMOS diagrams were used 
to produce a test-specification document. Additionally, the team 
estimated the development effort for each response bubble in 
hours. That data was used as a tool to monitor development efforts 
and report status. 


Figure 1: CMOS actor diagram for ATM. 


Mayer Actor Diagrams 

The actor diagram was used to determine sources of events 
for the Mayer receiver system. It was the first CMOS diagram 
created and helped define the actors of the Mayer software. 
Besides defining the software actors, the diagram helped to 
discover which users would initiate events, send and receive 
events, or only receive events. Figure 4 is an example of this 
diagram. Once defined, the diagram was frequently used as a 
reference throughout the rest of the CMOS modeling process. 


Mayer Actor-Event Diagrams 
CMOS actor events are record- 
ed in sequence on actor-event 
diagrams. Actor events are used 
to define system behavior in 
greater detail. Actor-event dia- 
grams are similar to use case se- 
quence diagrams but provide more detail. Figure 5 is a sam- 
ple of actor-event diagrams for the Mayer receiver system. 
The diagram reads from left to right across a series of sequences. 
The left-most event on sequence one starts a scenario. Many 
scenarios can branch off of a single event. Scenarios proceed 
to a final sequence on the right, then repeat back to other 
sequences. 

The actor-event diagram was most useful to systems engi- 
neers and the customer because it provided a simple, high- 
level picture of how the system worked before it was con- 
structed or even designed. The diagram allowed developers 
and systems engineers to play the role of an actor by tracing 
through the diagram. It was also helpful to sketch out a GUI 
while building this diagram. When tracing through the dia- 
gram, engineers referenced the GUI that corresponded to each 
CMOS actor event. The customer also had no problem con- 
necting the GUI to a sequence of actor events. More impor- 
tant, early role-playing allowed the development team to 
demonstrate the basic behavioral concepts of the system to 
our nontechnical customer in an understandable manner. The 
customer was able to suggest changes that made a significant 
difference in their satisfaction with the product prior to any 
development being completed. 





Figure 2: CMOS sample actor-event diagram for ATM. 

Mayer Response Diagrams 

The response diagrams were the most useful to software de- 
signers and implementers. Figures 6 and 7 are two samples of 
the 71 response diagrams that provide the details of how the 
software responds to the Mayer receiver actor events. 

Each response diagram corresponds to an actor event defined 
on the actor-event diagram. Figure 6 represents the software responses 
to the actor event number 150, “Operator presses stop button 
for ChX.” Figure 7 represents the software responses to actor 
event number 90, “Monitor starts program.” The circles repre- 
sent software responses that occur along possible scenario paths 
responding to the event. The colors (shades of gray in the black- 
and-white version) were added to represent 12 design domains 
that were created after the CMOS model was completed. 


Lessons Learned 
From the Mayer receiver project, we learned a number of things 
about the use and usefulness of the CMOS methodology. 


Numbering the Events and Responses. Even though ac- 
tor events and responses have textual descriptions, it is helpful 
to attach number identifiers to them. This approach was not sug- 
gested in the original CMOS article. These numerical identifiers 
are helpful in that they: 


• Make it easier to trace from actor events to system responses. 

• Allow for tracking of work accomplished using actor events 
and responses. 

• Let programmers easily identify actor events and responses in 
the code. 

• Make it easier to reference actor events and responses when 
building test procedures or scenarios. 


The numbering process uses intervals of 10 to allow insertion 
of future actor events or responses. The interval is arbitrary and 
can be any value; however, we recommend at least 10 to allow 
for future growth and modifications. 
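
The numbering scheme can be sketched in a few lines. The function names, the event descriptions, and the rule of picking the free identifier nearest the midpoint are assumptions made for illustration. The article specifies only the interval of 10.

```python
STEP = 10  # interval recommended in the text; any value works


def number_items(descriptions, step=STEP):
    """Assign identifiers 10, 20, 30, ... to events or responses in order."""
    return {step * (i + 1): text for i, text in enumerate(descriptions)}


def insert_between(numbered, lower, upper, text):
    """Add `text` under an unused identifier strictly between lower and upper.

    Picks the free identifier closest to the midpoint so that room is left
    on both sides for later insertions. Raises ValueError when the gap is full.
    """
    free = [n for n in range(lower + 1, upper) if n not in numbered]
    if not free:
        raise ValueError("no free identifier between %d and %d" % (lower, upper))
    midpoint = (lower + upper) // 2
    new_id = min(free, key=lambda n: abs(n - midpoint))
    numbered[new_id] = text
    return new_id


# Hypothetical actor events, not taken from the Mayer model.
events = number_items([
    "Operator starts program",
    "Operator selects channel",
    "Operator presses stop button",
])
new_id = insert_between(events, 10, 20, "Operator loads configuration")
```

With an interval of 10, nine events can be inserted between any two neighbors before renumbering becomes necessary.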

Splitting the Product into Domains. The first step in the 
software-design process was to compartmentalize the effort into 
application and service domains. These domains provide a set of 
related classes to perform a subset of the system functionality. By 
compartmentalizing the system in this manner, individual devel- 
opers were able to take complete responsibility for a portion of 
the system. Additionally, it allows for prototyping and testing por- 
tions of the system without needing the entire system. 

To do this, we studied the CMOS diagrams and decided how 
we could best group the many behaviors into some manage- 
able grouping of applications to perform the system task. We 
identified 12 domains and assigned individual team members to 
each domain. 

The key element of this effort was a domain communication 
model that was developed primarily from the CMOS model. The 
model uses color coding to easily identify each domain. Figure 
8 is an example of the Mayer domain communication model. 


Figure 3: CMOS sample system-response diagram for ATM. 
(Recoverable labels: actor CardUser; actor event “Enters TransType”; 
response “Query Bank for Balance” when TransType = Balance; 
response “Ask Display to display a request for amount” when 
TransType = Withdrawal/Deposit.) 


Overlaying Domains onto the Response Diagrams. Even 
though CMOS is a purely analytical method, we found that it 
also has the capability to reveal preliminary design information. 
We decided to indicate which domains participated in each 
CMOS response bubble so that developers would know for 
which ones they were responsible. A domain is a set of related 
classes. We used the domain’s color code to overlay participat- 
ing domains on response bubbles, making more visible which 
domains were participating in each bubble. In this process, we 
found that some bubbles had multiple domains participating. 
This is not too surprising because the bubbles represent be- 
havior and the domains represent structure. 

Next, the interfaces between the domains had to be defined 
before any serious domain-specific design could take place. We 
set out to build a domain communication model and quickly 
discovered that because we had identified each domain partic- 
ipating in a bubble, the bubbles with multiple domains (colors) 
revealed key locations in the design structure where messages 
would be passed (communication interfaces) between those do- 
mains. Essentially, the CMOS response diagram with the over- 
lying domains showed exactly where the domains needed to 
communicate. This was a valuable bridge that helped us to re- 
late the behavior in the CMOS model with the structure in the 
design’s domain model. 
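
The overlay step can be sketched in a few lines of Java. This is only an illustration of the idea, not code from the project: the domain names are taken from Figure 8, but the bubble numbers and the assignment of domains to bubbles are invented for the example. Any bubble that carries more than one domain yields one candidate interface per pair of domains.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

class DomainInterfaces {
    /**
     * Returns every unordered pair of domains that share at least one
     * response bubble, written as "A<->B" with the names in sorted order.
     */
    static SortedSet<String> interfaces(Map<String, Set<String>> domainsByBubble) {
        SortedSet<String> pairs = new TreeSet<>();
        for (Set<String> domains : domainsByBubble.values()) {
            List<String> sorted = new ArrayList<>(new TreeSet<>(domains));
            for (int i = 0; i < sorted.size(); i++) {
                for (int j = i + 1; j < sorted.size(); j++) {
                    pairs.add(sorted.get(i) + "<->" + sorted.get(j));
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Hypothetical overlay: bubble number -> domains colored onto that bubble.
        Map<String, Set<String>> overlay = new LinkedHashMap<>();
        overlay.put("90.100", Set.of("GUI"));
        overlay.put("90.110", Set.of("GUI", "FileProc"));
        overlay.put("90.150", Set.of("FileProc", "Log", "TaskMgr"));
        System.out.println(interfaces(overlay));
    }
}
```

Single-domain bubbles contribute nothing, which matches the observation that only the multicolored bubbles mark communication interfaces.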

Most design methodologies, including the Rational Unified 
Process and Shlaer-Mellor, suggest that a domain-level com- 
munication diagram be constructed to show the message com- 
munication between domains. CMOS helped this effort by pro- 
viding domain communication paths early in the preliminary 
design phase. These communication paths were based on pure 
analysis of the system’s behavior from the actor’s perspective. 
The communication paths may not be complete, but should be 
correct. Completeness will come later because some of the do- 
main communication paths will be established based on design 
or implementation considerations. 

Tracking Progress with Response Diagrams. The response 
diagrams were used to track development progress, specifically
design, implementation, and testing efforts. Developers
were assigned to the domains that overlaid the response diagrams.
They were asked to estimate the hours it would cost to
implement their portion of each bubble. Each bubble and its
estimated hours were placed into a spreadsheet. (A total of all
bubbles’ estimated hours was also a good source for costing
the project.) As a developer worked on a bubble, he would enter
a percent complete. Multiplying the percent complete by
the estimated hours for each bubble determined hours spent
and remaining for each bubble. Adding up all of the hours for
each bubble determined hours spent and remaining for all of
the bubbles combined. This allowed for an accurate
percent-complete value that could be entered into our earned value
system every month.
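
The spreadsheet arithmetic is simple enough to show directly. The sketch below assumes one row per bubble with an estimate in hours and a percent complete from 0 to 100; the class name, the row layout, and the sample figures are all invented for illustration.

```java
import java.util.List;

class BubbleProgress {
    /** One spreadsheet row: a bubble, its estimate, and the developer's percent complete. */
    static final class Row {
        final String bubble;
        final double estimatedHours;
        final double percentComplete;   // 0 to 100

        Row(String bubble, double estimatedHours, double percentComplete) {
            this.bubble = bubble;
            this.estimatedHours = estimatedHours;
            this.percentComplete = percentComplete;
        }

        double hoursSpent() {
            return estimatedHours * percentComplete / 100.0;
        }

        double hoursRemaining() {
            return estimatedHours - hoursSpent();
        }
    }

    /** Overall percent complete: total hours spent over total estimated hours. */
    static double overallPercentComplete(List<Row> rows) {
        double spent = 0;
        double estimated = 0;
        for (Row r : rows) {
            spent += r.hoursSpent();
            estimated += r.estimatedHours;
        }
        return estimated == 0 ? 0 : 100.0 * spent / estimated;
    }

    public static void main(String[] args) {
        // Hypothetical rows: 20 + 10 + 0 hours spent out of 100 estimated.
        List<Row> sheet = List.of(
            new Row("90.100", 40, 50),
            new Row("90.110", 10, 100),
            new Row("90.150", 50, 0));
        System.out.println(overallPercentComplete(sheet) + "% complete");
    }
}
```

Because the total is weighted by estimated hours, a large unfinished bubble counts for more than a small finished one.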





Figure 4: Sample of the Mayer actor diagram. 


Dr. Dobb’s Journal, June 1999 




























Creating Test Procedures. The CMOS actor-event and 
software-response diagrams allowed us to create test cases quick- 
ly and efficiently. Each actor event was listed numerically in a 
table. Figure 9 is a sample of the test case matrix for actor event 
number 90, “Operator starts Program.” The first two columns 
track the test case to the CMOS actor event and its possible re- 
sponses. Notice that under the “Software Scenarios” column each 
possible path (defined by response bubble numbers) is defined 
as a test case for actor event number 90. This allowed us to test 
the most probable scenario paths based on user behavior. Re- 
peated paths were avoided to minimize the propagation of re- 
dundant cases. This was strictly an integration, black box test 
because it was based on behavior paths and not code paths. 
The entire set of test cases for every actor event was repeated 
three times: once for software integration testing, once for hardware
integration testing, and finally for customer qualification
testing. Using the CMOS model to produce these test cases was 
a great improvement in test case production. Also, the test cas- 
es were produced just after the CMOS diagrams were complet- 
ed and just before the design phase of the software began. By 
reviewing the test cases this early in the development cycle we 
were able to revalidate the requirements and reaffirm that our 
design was on target. 
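
As a rough illustration of how scenario paths fall out of a response diagram, the following Java sketch enumerates every path from a starting bubble to a terminal bubble. The graph representation and the sample bubble numbers are assumptions made for this example. The sketch also refuses to revisit a bubble already on the current path, which is a simplification; real response diagrams may loop back.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

class ScenarioPaths {
    /** Lists each path from start to a bubble with no successors, formatted like "10-20-30". */
    static List<String> paths(Map<Integer, List<Integer>> next, int start) {
        List<String> out = new ArrayList<>();
        walk(next, start, new ArrayDeque<>(), out);
        return out;
    }

    private static void walk(Map<Integer, List<Integer>> next, int bubble,
                             Deque<Integer> path, List<String> out) {
        if (path.contains(bubble)) {
            return;   // simplification: do not loop back through a bubble on this path
        }
        path.addLast(bubble);
        List<Integer> successors = next.getOrDefault(bubble, List.of());
        if (successors.isEmpty()) {
            StringJoiner joined = new StringJoiner("-");
            for (int b : path) {
                joined.add(Integer.toString(b));
            }
            out.add(joined.toString());   // one terminal path = one candidate test case
        } else {
            for (int s : successors) {
                walk(next, s, path, out);
            }
        }
        path.removeLast();
    }

    public static void main(String[] args) {
        // Hypothetical diagram: bubble 30 branches to 40 or 50, both ending at 60.
        Map<Integer, List<Integer>> diagram = Map.of(
            10, List.of(20),
            20, List.of(30),
            30, List.of(40, 50),
            40, List.of(60),
            50, List.of(60));
        paths(diagram, 10).forEach(System.out::println);
    }
}
```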

Placing Response Bubbles into the Code. The “Results”
column in the test matrix for actor event 90 (Figure 9) is actu- 
ally a directory. The directory contains a file for each of the 12 
domains defined for the Mayer receiver software. Each file in- 
cludes printouts of CMOS bubble numbers that were actually 
traversed when the software was run for that test case. This was 
accomplished by having each developer code an output state- 
ment that printed out the bubble number whenever the func- 
tionality of the bubble was executed by the code. This was ex- 
tremely useful in determining if each test case actually followed 
the CMOS response path expected. It was also useful in re- 
gression testing because we could simply compare these files 
to later test case results to see if anything had changed. This 
technique was also valuable for locating errors in the code dur- 
ing the test. The test case would reveal which CMOS bubble 
did not work correctly. Developers would search (using a grep 
routine) for that bubble in the code. This process would lead 
developers to the exact location of the error. 
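
A minimal Java sketch of the tracing idea follows. The article does not show the actual output statement, so the "BUBBLE" marker, the class name, and the comparison helper are hypothetical. The point is only that each bubble's code emits a greppable line, and that a recorded path can be compared with the expected one.

```java
import java.util.ArrayList;
import java.util.List;

class BubbleTrace {
    private final List<String> visited = new ArrayList<>();

    /** Call this where the code implements a bubble's behavior. */
    void bubble(String number) {
        visited.add(number);
        System.out.println("BUBBLE " + number);   // greppable marker for the results file
    }

    /** Bubbles traversed so far, in order. */
    List<String> visited() {
        return List.copyOf(visited);
    }

    /**
     * Index of the first position where the actual path departs from the
     * expected one, or -1 if the two paths are identical.
     */
    static int firstDivergence(List<String> expected, List<String> actual) {
        int n = Math.min(expected.size(), actual.size());
        for (int i = 0; i < n; i++) {
            if (!expected.get(i).equals(actual.get(i))) {
                return i;
            }
        }
        return expected.size() == actual.size() ? -1 : n;
    }

    public static void main(String[] args) {
        BubbleTrace trace = new BubbleTrace();
        trace.bubble("90.10");
        trace.bubble("90.20");
        trace.bubble("90.35");   // hypothetical wrong turn
        List<String> expected = List.of("90.10", "90.20", "90.30");
        System.out.println("diverges at step " + firstDivergence(expected, trace.visited()));
    }
}
```

The same comparison serves regression testing: a stored path from an earlier run plays the role of the expected list.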

Create a Drawing of the GUI Interface during Modeling.
From the beginning of the modeling process it was
helpful to draw GUI interface components as the behavior
was being discovered. These drawings became invaluable
when communicating behavior between developers and to
the customer using the CMOS diagram. They were like visual
aids to the model. It was not necessary to spend a great
deal of time drawing these components. We used Microsoft
Word to draw simple specifications of the components. It
was a simple task to then build the real GUI components
from the drawings.

Label GUI Components. Labeling GUI components with informative
titles allows references to associated titles in the model. An
example of this is circle 90.190 (Figure 7), which states “display xmit
assignment file not loaded modal dialog box.” The “xmit assignment
file not loaded modal dialog box” is a title for a dialog box that was
drawn in the GUI diagrams. The advantage of this approach is that
the dialog box itself can change its graphical appearance without
having to change the CMOS model.

Figure 6: Operator presses stop button for chX response diagram.

Figure 7: Operator starts program-response diagram.

Trace CMOS Scenarios at CDR. The Critical Design Review 
(CDR) for this project was a major success due in part to the 
CMOS method. We reversed the model into text scenarios and 
presented them along with the diagrams that produced them. 
The customer had no software technical background and yet 
could easily read and understand the specified behavior. This 
let him make intelligent suggestions and further validate
the behavior. An actual quote by the customer was “I have
never seen a software CDR like this where I paid attention to 
every cell and I learned from it.” 

Figure 8: Domain model created from CMOS. FileProc, light
green; EdtaMgr, black; TaskMgr, red; DiskMgr, gray; Channel,
brown; GUI, dark green; Timer, cyan; Log, blue; Receiver,
yellow; MessProc, purple; Formatter, pink; Output, orange.

Figure 9: Test plan and procedures matrix for actor event 90.
Software scenario paths shown in the matrix:
10-20-30-40-90-100-110-150-160-170-230-180
10-20-30-40-90-100-110-150-160-200-250-255
10-20-30-40-90-100-110-150-110-200-150-180
10-20-30-40-90-100-110-150-160-200-240-180

Usage with UML. We initially used Jacobson’s Use Cases
to help discover and record behavior for the Mayer receiver
software. Use cases provided the same level of information as
the actor-event diagrams. Use cases and the actor-event diagrams
were useful in providing clarity for user behavior. The
use cases and supporting interaction diagrams, however, did
not provide enough information about how the software should 
respond to actor events. The CMOS system-response diagrams 
provided this information. After actor events were defined, the 
system-response diagrams were used to help bridge the software
behavior to the software design (see “Overlaying Domains
onto the Response Diagrams”). Using use case diagrams
by themselves would have been much more difficult.
Following the construction of the domain model, the develop- 
ment team used UML object models and state diagrams to com- 
plete the design of the software. 

It is interesting that the CMOS actor-event and response dia- 
grams had some correlation with the UML state diagrams. For 
instance, we could map response bubbles that communicated 
between domains with the states that sent messages between 
those domains. All of the events in the actor-event diagrams 
mapped directly to events occurring in the state models. It was 
a simple matter to validate the state models by tracing events 
and messages back to the original behavior defined in the 
CMOS diagrams. 

Suggested Improvements. Even though CMOS claims that 
it does not need an elaborate CASE tool, it would have been 
helpful to have some kind of mechanism to link actor events 
with responses. This capability became more important as the 
specification grew in size. A tool such as Visio, which pro- 
vides this linking mechanism, would be better than Word for 
a larger specification. It also became cumbersome to try to
draw the domain colors on the bubbles, especially when there
were multiple domains on a bubble. These problems were
minor compared to the benefits of using the method; however,
it would be nice to see a CMOS-specific tool developed
in the near future. 


Conclusion 
The specification produced by CMOS was invaluable throughout 
the software-development process. There were many occasions 
when we would reference the model to look up important be- 
havioral and analysis data. Finding this data was easy using the 
model. Without it, we would have spent many extra hours writ- 
ing and looking up textual representations for the same data. This 
alone accounted for a three-fold improvement in cycle time. 
The model helped to greatly improve test-case development 
productivity. The test cases were developed early in the develop- 
ment cycle and were used to help validate the requirements be- 
fore the design phase began. The model was also used as a vali- 
dation tool when the customer added or changed requirements. 

Interfaces between software domains were easily identifiable
after overlaying the domain structure onto the CMOS response 
diagrams, saving a great deal of time and providing an accurate 
communication model. 

Progress tracking of the development effort was made painless
for both management and developers because the subjectivity
was taken away. We tracked project tasks that, because of
CMOS, had been validated by management and customers.
Developers only needed to know which bubbles they were working
on and, at the end of each week, filled in the percent complete
on a simple spreadsheet.

Our customer was delighted with the specification because it 
was easy to read and communicated the expected software be- 
havior in a clear and precise manner. Before a new or changed 
requirement was approved, it was first added to the CMOS mod- 
el to make sure that it would work correctly with the existing 
system. This allowed developers to quickly determine if a new 
requirement was feasible. 

The CMOS method is easy to learn and does not require a host 
of CASE tools to implement. Most any drawing tool can handle 
the CMOS modeling components. A mechanism to help link ac- 
tor events with responses would have been helpful. Several future 
projects for the Mayer receiver are currently planning to use the 
CMOS method. For instance, the SSSD tools working group is us- 
ing CMOS to help define how configuration and requirements 
management tools interface with various users. 

We believe that this method can be used by anyone, and the 
time to incorporate the method is minimal, while the benefits 
are numerous. 

Java Portability by Design

Java is a tool that enables truly
portable applications. This is especially
important in the world of e-commerce,
where heterogeneous systems
are the norm. But having a portable
language is not enough to ensure that
your code behaves the same across all
the systems. This is because, eventually,
you will run into some subsystem (a
database, for instance) that is non-Java
and behaves differently on different operating
systems. Quite often you'll find
that sticking to database calls that are
only in the ODBC specification is not
possible. You might want to take advantage
of a particular database function that
is not part of ODBC and so will not work
on all databases. Something as simple as
DB2’s SELECT DISTINCT, which returns
only distinct values (no duplicates), is not
part of the spec and so is not portable
across database vendors.





Another area of concern for portability
involves using Java’s Unicode character
set with a non-Unicode database or
on a platform that uses double-byte characters.
These scenarios require that your
Java application be aware of these subsystem
differences. If you want to ensure
that the data a user inputs will fit in
a column of the database, you need
more information than just the column
length. Since a database column length
of CHAR(32) is really 32 bytes (not necessarily
32 characters), you need to know
how many bytes the database needs to
store both single- and double-byte character
strings.
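
The gap between character count and byte count is easy to demonstrate. In the sketch below, UTF-8 merely stands in for whatever encoding the database uses, and the ColumnFit class and its fits method are invented for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ColumnFit {
    /** True if the encoded value needs no more bytes than the column holds. */
    static boolean fits(String value, int columnBytes, Charset dbCharset) {
        return value.getBytes(dbCharset).length <= columnBytes;
    }

    public static void main(String[] args) {
        String ascii = "Rofrano";                  // 7 characters
        String kanji = "\u65E5\u672C\u8A9E";       // 3 characters
        Charset db = StandardCharsets.UTF_8;       // assumption: stand-in for the database encoding

        System.out.println(ascii.length() + " chars, " + ascii.getBytes(db).length + " bytes");
        System.out.println(kanji.length() + " chars, " + kanji.getBytes(db).length + " bytes");
        System.out.println("kanji fits in CHAR(8)? " + fits(kanji, 8, db));
    }
}
```

A check based on String.length() alone would accept the three-character value for an eight-byte column even though its encoded form is longer.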

In this article, I'll discuss the use of fac- 
tory classes, which I’ve found to be an ef- 
fective design for solving these and other 
platform-dependent problems. Factory class- 
es keep the application code unaware of 
the platform it’s running on, while making 
porting to new platforms straightforward. 

The application my team built, IBM
Net.Commerce Product Advisor, is an
e-commerce catalog search engine written
entirely in Java. It uses a relational
database with both local and remote Java
DataBase Connectivity (JDBC) for all its
database access. Java Servlets are used to
provide web server-side functionality that
renders information to the client browser.
It runs on five different operating systems,
two of which use Extended Binary Coded
Decimal Interchange Code (EBCDIC) character encoding,
and in 10 national languages, four of
which use Double Byte Character Strings
(DBCS). It would have been nice if Java,
with its JDBC and Unicode support, had
masked all these differences from the application,
but the reality is that this is simply
not possible.

For any Java application to be truly 
portable, you need to find the subsystems 
that are outside the Java environment 
and define and encapsulate the behav- 
ior of those systems. The problem is that 
new subsystems might be added later, 
so you need to plan for this type of ex- 
pansion. Good design is the most im- 
portant factor in building any applica- 
tion, and good design starts with a 
thorough definition and analysis of the 
problem you're trying to solve. When 
doing analysis for any system, the first 
thing you need to define is the bound- 
aries of the system. What’s inside it? 
What’s outside of it? What does the 
boundary behavior between the inside 
and outside look like? A key success factor
in this work is a good set of programming
objectives. One of the key ob- [...]

[...] al. Addison Wesley, 1995). This is polymorphism
beyond what normal sub- [...]
useful when the decision of which class
to use must be done at run time and cannot
be hard coded during development.
Factory classes encapsulate the logic
needed to decide which subclass to instantiate
and so remove this decision
from the application, delegating it to the
factory. Using Java’s dynamic class load- 
ing, you can build a system that can be 
extended with new classes without hav- 
ing to modify or recompile the original 
application. This is usually accomplished 
by following a naming pattern that uses 
some type of information to predict the 
name of the subclass needed and dy- 
namically load it. 

One example use of Factory classes 
was when we instantiated a Category ob- 
ject in our electronic catalog. In the 
Product Advisor application, an e-com- 
merce catalog of products was grouped 
into categories that could be traversed 
to find a product. Figure 1 presents an 
object model for the category relation- 
ships. A Catalog is stored in a DataStore 
and is composed of a collection of one 
or more Categories, which may contain 
other categories and/or products. Prod- 
ucts are defined by a collection of fea- 
tures. If you ask a category for its prod- 
ucts, it returns a collection of products 
from that point in the tree on downward. 
So if you ask a high-level category,
which only contains other categories, for
its products, you must recursively traverse
the tree, asking each subcategory
for its products to get the complete list
of products from that point in the tree
downward.

Some databases, such as IBM’s DB2
Universal Database V5 (DB2 UDB V5), have
defined a recursive query syntax for solving
this classic “bill of materials” problem.
This syntax is not part of ODBC and will 
not work with other databases. This be- 
havior could have been coded to the low- 
est common denominator to be ODBC 
compliant, but we wanted DB2 customers 
to get the performance benefit of the built- 
in recursion. 

Design Patterns discusses Factory meth- 
ods, but assumes there is a logical object 
to place the method in. If no such object 
exists in your design, you can use a Fac- 
tory class. The sole purpose of Listing 
One (the source code for the factory 
class, CategoryFactory) is to instantiate 
the proper Category object based on the 
type of database you are using. So if you 
wanted to support both DB2 and Oracle, 
you would define a DB2Category and 
OracleCategory — each a subclass of Cat- 
egory and each having the proper query 
syntax for their database. The factory uses 
information stored in the DataStore 
(which represents the physical database) 
to determine which class to instantiate at 
run time. 

In short, there are several design points
about factory classes:

• The class that’s returned by the factory
class is a subclass of the type of class
you actually want (CategoryFactory returns
the proper subclass of Category,
for instance).
• The class must have a default constructor
(a constructor without parameters)
so that it can be dynamically instantiated.
• The class needs to have access methods
to set other needed properties because
of the previous point.
• The call syntax to the factory class
should be the same as if you had created
the class with the new operator.
• The constructor for the factory is private
because there is never a need to
instantiate this class. It’s just a utility class
with a static method for creating the correct
subclass.
• Instantiation errors should result in returning
a null object to indicate that an
object could not be instantiated. Never
return a partially instantiated object.


What is the significance of these de- 
sign points? Factory classes are used to 
get the right implementation of an ab- 
stract base class. The classes that are re- 
turned are always subclasses of the base 
class you need. When you dynamically 
instantiate a class by name, the default 
constructor is called by the loader. The 
default constructor is a constructor that 
has no formal parameters. Because of 
this, only classes that have default con- 
structors can be dynamically instantiated 
in this fashion. If other parameters must
be set before the object can be used (that is, the
object shouldn’t have a public default constructor),
give the default constructor
package-level scope. This lets the factory
class, which is in the same package,
instantiate it, but does not allow classes
outside of the package to instantiate it.
They must go through the factory.
The factory class should set the other
parameters before returning the object
so you know that clients will always get
a fully instantiated object. In our exam- 
ple, a Category should not be instanti- 
ated without it knowing what catalog it 
belongs to. This is why we call the set- 
Catalog( ) method before returning the 
object (see Listing One). Using a factory 
like this has the same effect as if you called 
a constructor such as Category(Catalog). 

The reason I suggest making the call to
the factory class the same as if you would
have instantiated it yourself is to minimize
the impact of adding new factory classes.
When using factory classes to support multiple
heterogeneous environments, it
would be nice if you knew all the differences
before you start, but invariably you
will be well down the implementation path
when you find something new that you
didn’t provide a factory for. If you keep
the signatures the same, the changes to
your code will be trivial.

Figure 1: Simplified catalog object model. (Relationship labels:
is composed of; may contain; defined by; +parent.)

For example, before knowing you 
needed different versions of the Prod- 
uct class, assume you instantiated a Prod- 
uct with: 


Product prod = new Product(Category); 


Then you discover that you need to implement
Product differently on a particular
database. No problem, you create a
factory for Products and change every call
to new Product(x) into ProductFactory.createProduct(x)
and you have:


Product prod = ProductFactory.createProduct(Category);


Several times we came across the need
for a new factory for classes we had already
implemented. Keeping the call signatures
the same substantially lessened the amount
of code change needed.

Figure 2: Abstract category and subclasses.

Since the purpose of the factory class 
is to instantiate other classes and never be 
instantiated itself, you should make the 
default constructor private. There is no 
harm done if you don’t, but I’ve seen pro- 
grammers instantiate a factory object, then 
call its static methods. By making the de- 
fault constructor private, their code won't 
compile, warning them that they don’t 
need to waste any execution time or mem- 
ory instantiating a factory class. Finally, if 
anything goes wrong during dynamic in- 
stantiation, it’s a good idea to return a null 
object so that there is no confusion that 
this object should not be used. The most 
common thing to go wrong is not being 
able to dynamically instantiate the object. 
There have been times when the object is 
created correctly, but setting one of the 
needed parameters fails. In this case, you 
should return a null object because the 
object could not be fully instantiated. 

Using a naming convention to construct 
the correct object makes things straight- 
forward. In the case of the Category class, 
a properties file specifies the database type 
that’s returned by DataStore.getPrefix(). 
This can be either DB2, DB390, DB400, 
or Oracle. The Category class itself is the 
abstract base class that defines the be- 
havior of a Category. All of the common 
code is placed in this class. Unique code 
is placed in abstract methods that the sub- 
classes must implement. We use the name 
of the database in the properties file as a 
prefix for the class name. So for DB2 we 
need to implement a DB2Category class; 
for Oracle, an OracleCategory class as in 
Figure 2. The factory simply prepends the 
database name to the class name and dy- 
namically instantiates the class by name 
(see Listing One). 


category = (Category) Class.forName(className.toString()).newInstance();


You might ask, “Why not just code, if 
DB2 then this, else if Oracle then that?” 
Herein lies the extensibility of the facto- 
ry design. If you hard coded if-then-else 
logic, you’d have to modify the code to 
add a new database. Because the facto- 
ry can assemble the name of the class, 
you can add support for a new database 
without modifying any code. If, in the 
future, you need to support Informix, 
you implement an InformixCategory 
class, place the value “Informix” in the 
properties file, and at run time the fac- 
tory will instantiate the new class. No 
change is needed in the factory class or
any classes that use Category classes. This 
also makes it very easy to figure out 
what’s needed to extend the system to 
support a new database. Just count the 
number of factory classes that represent 
persistent objects and those are the ones 
you need to provide. 
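
The naming-convention approach can be condensed into a self-contained Java sketch. It is not the article's Listing One: here a plain catalog name stands in for the Catalog and DataStore objects, the query strings are placeholders, and the class name is derived from the base class's own name so the example runs in any package. It also uses getDeclaredConstructor().newInstance(), the current replacement for the Class.newInstance() call shown above.

```java
abstract class Category {
    private String catalogName;      // stand-in for the article's Catalog object

    Category() { }                   // package-level: clients must go through the factory

    void setCatalogName(String name) { catalogName = name; }

    String getCatalogName() { return catalogName; }

    abstract String productQuery();  // database-specific code lives in the subclasses
}

class DB2Category extends Category {
    String productQuery() { return "recursive query (placeholder for DB2 syntax)"; }
}

class OracleCategory extends Category {
    String productQuery() { return "hierarchical query (placeholder for Oracle syntax)"; }
}

final class CategoryFactory {
    private CategoryFactory() { }    // utility class, never instantiated

    /** Returns a fully initialized subclass for the prefix, or null if none can be built. */
    static Category createCategory(String dbPrefix, String catalogName) {
        // Prepend the prefix to the base class's simple name, keeping its package.
        String base = Category.class.getName();
        String simple = Category.class.getSimpleName();
        String className = base.substring(0, base.length() - simple.length()) + dbPrefix + simple;
        try {
            Category category =
                (Category) Class.forName(className).getDeclaredConstructor().newInstance();
            category.setCatalogName(catalogName);
            return category;
        } catch (ReflectiveOperationException | ClassCastException e) {
            return null;             // never hand back a partially built object
        }
    }

    public static void main(String[] args) {
        Category c = createCategory("DB2", "spring-catalog");
        System.out.println(c.getClass().getSimpleName() + ": " + c.productQuery());
        System.out.println("Informix supported? " + (createCategory("Informix", "spring-catalog") != null));
    }
}
```

Adding an InformixCategory class to the same package would make the second call succeed with no change to the factory.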


Factory Classes and NLS 

National Language Support (NLS) is an- 
other portability issue. Applications 
should not need to be aware of platform- 
specific NLS concerns. While Java pro- 
vides a consistent framework for NLS 
across operating systems, there is no 
guarantee that the underlying persistence 
mechanisms won't have their own quirks. 
One of these is the difference between 
double-byte character support across 
ASCII and EBCDIC databases. Java sup- 
ports Unicode, so all characters in Java 
are double byte. This may lull you into 
a false sense of security about not hav- 
ing to worry about double-byte charac- 
ters. When storing character strings in a 
database that doesn’t support Unicode, 
however, you still need to be concerned 
about the number of bytes a character 
will need in the database. 

For instance, say you have a database 
column LASTNAME that is defined as 
CHAR(32). This means you can store up to 
32 single-byte characters. If, however, your 
application is being used in a double-byte 
country and your database doesn’t support 
Unicode, you can only store 16 double-byte
characters. If this was the only problem,
you could simply divide by two and
check the length of the string to determine,
in the GUI of your application, if
the string entered will fit in the database.
Unfortunately, EBCDIC systems handle 
double bytes a bit differently than ASCII 
systems. They have special characters 
called “shift-out” and “shift-in” characters 
that mark the start and end of double-byte 
data. This is how the database determines 
if it should use the next byte or two bytes 
to form a character. 

If your application transfers mixed-byte 
data from an ASCII system to an EBCDIC 
system, you have to allow enough room 
for the shift characters. For each switch 
from SBCS to DBCS data, add 2 bytes to 
your data length. To relieve you from wor- 
rying about this, you can use a string- 
length calculator utility class and a facto- 
ry class to instantiate the correct object. 
By using a factory class, you leave the de- 
sign open to adding new string-length 
calculators as you find new systems that 
handle SBCS or DBCS characters differ- 
ently. Also, if a database adds Unicode 
support and the calculation algorithm 
changes, you only have to change your 
code in one place. 


[Figure 3 diagram labels: StringLengthCalculator, getStringLength(),
SBCSStringLengthCalculator]

Figure 3 is the object model for the
StringLengthCalculator class in Listing
Two. There is an abstract base class that
defines the behavior for the class. It has
one method, getStringLength(String str, Locale loc),
which takes a string and a Java Locale.
For the default SBCS implementation, it
just returns the length of the string (that
is, return str.length();). For the default
DBCS implementation, it returns twice
the string length (return (str.length() * 2)).
For the DB390 implementation, it
scans the string, counts the single-byte
characters, the double-byte characters, and
the switches between single- and double-byte
data (shift-in, shift-out), and returns
the resulting byte count.
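
The three calculations can be sketched as follows. This is not Listing Two. In particular, the rule used here to decide whether a character is double-byte (anything at or above U+0100) is a placeholder assumption; a real implementation would consult the host code page. The shift accounting follows the rule given earlier: two extra bytes for each run of double-byte data.

```java
class MixedByteLength {
    /** Placeholder rule (an assumption): code units at or above U+0100 are double-byte. */
    static boolean isDoubleByte(char c) {
        return c >= 0x0100;
    }

    /** Default SBCS case: one byte per character. */
    static int sbcsLength(String s) {
        return s.length();
    }

    /** Default DBCS case: two bytes per character. */
    static int dbcsLength(String s) {
        return s.length() * 2;
    }

    /** Host case: each double-byte run is bracketed by a shift-out and a shift-in byte. */
    static int shiftedLength(String s) {
        int bytes = 0;
        boolean inDoubleByteRun = false;
        for (int i = 0; i < s.length(); i++) {
            boolean dbl = isDoubleByte(s.charAt(i));
            if (dbl && !inDoubleByteRun) {
                bytes += 2;          // shift-out now plus the shift-in that ends this run
            }
            bytes += dbl ? 2 : 1;
            inDoubleByteRun = dbl;
        }
        return bytes;
    }

    public static void main(String[] args) {
        String mixed = "AB\u65E5\u672CC";   // two single-byte, two double-byte, one single-byte
        System.out.println("sbcs=" + sbcsLength(mixed)
            + " dbcs=" + dbcsLength(mixed)
            + " shifted=" + shiftedLength(mixed));
    }
}
```

For the mixed string, the shifted count is 2 + (2 + 4) + 1 = 9 bytes, which neither of the default calculations produces.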

Figure 4: Command syntax of the two versions.

This factory class operates a bit differently
from the first one that selected the
correct database implementation for a Category.
In the first type of factory, if the
proper class wasn’t found, a null object
was returned. In this implementation, the
factory tries to instantiate the most specific
class it can and keeps walking up the
hierarchy to the more generic. This allows
the insertion of more or less specific
implementations as needed.
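
A compact sketch of the specific-to-generic fallback is shown below. The class names (LengthCalc and its subclasses) are invented so the example is self-contained, and the DB390 calculation is a deliberately simplified placeholder. The candidate order mirrors the description above: database prefix plus byte mode first, then byte mode alone.

```java
abstract class LengthCalc {
    abstract int length(String s);
}

class SBCSLengthCalc extends LengthCalc {
    int length(String s) { return s.length(); }
}

class DBCSLengthCalc extends LengthCalc {
    int length(String s) { return s.length() * 2; }
}

class DB390DBCSLengthCalc extends LengthCalc {
    // Placeholder: assumes one double-byte run, so one shift-out and one shift-in.
    int length(String s) { return s.isEmpty() ? 0 : s.length() * 2 + 2; }
}

final class LengthCalcFactory {
    private LengthCalcFactory() { }

    /** Tries the most specific class name first, then the generic one; null if neither loads. */
    static LengthCalc create(String dbPrefix, String byteMode) {
        String base = LengthCalc.class.getName();
        String simple = LengthCalc.class.getSimpleName();
        String scope = base.substring(0, base.length() - simple.length());
        String[] candidates = {
            scope + dbPrefix + byteMode + simple,   // for example DB390DBCSLengthCalc
            scope + byteMode + simple               // for example DBCSLengthCalc
        };
        for (String name : candidates) {
            try {
                return (LengthCalc) Class.forName(name).getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException | ClassCastException e) {
                // not available: fall through to the next, more generic candidate
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(create("DB390", "DBCS").getClass().getSimpleName());
        System.out.println(create("Oracle", "DBCS").getClass().getSimpleName());
    }
}
```

Dropping a new class such as OracleDBCSLengthCalc into the package would override the generic behavior for that database only.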


Factory Classes and Multiple Version Support

The final use of factories is to support mul- 
tiple versions of a product where the com- 
mand syntax of other subsystems has 
changed. System boundaries are often a 
good candidate for factory classes. As sys- 
tems outside the boundary change you 
can change the implementation of your 
interface to accommodate it. But what if 
you have to support two versions of an 
outside system at the same time? Having 
two versions of your application is one 
way, but it’s much more desirable to main- 
tain single source. Factory classes are a 
good way to design this. 

A new command syntax was used be- 
tween version 2 and version 3 of Net.Com- 
merce. The OS390 version stayed with the 
old V2 syntax while the NT, AIX, and So- 
laris versions moved to the V3 syntax, so 
Product Advisor needed to work on both 
the old and new versions when con- 
structing a URL that sends a command to 
the server. Once again, we turned to the 
factory class to provide a means of in- 
stantiating the correct command based on 
the version in use. 

The syntax to request the display of a 
product page for both V2 and V3 is shown 
in Figure 4. As you can see, not only is 
the CGI program name different, but the 
command structure (display/item versus 
ProductDisplay) is different. In this case, 
a properties file has a parameter to flag 
the use of V2 or V3 command syntax. This 
lets the factory instantiate the correct com- 
mand to link to the product page. 

If customers wanted to supply another
way of linking to a product page, they
could define their own version of the
URLCommandLink class (Listing Three) and
use their own prefix value in the properties
file. Their version would then be called
whenever a URLCommandLink is needed.


Conclusion 

Applications that interact with various sub- 
systems invariably encounter differences 
between these systems across various
platforms. A portable language (like Java) and
good object-oriented design (like factory
classes) can be an effective way of
encapsulating the differences between systems
and producing a portable Java application
that is truly “write once, run everywhere.”


DDJ

Listing One

public class CategoryFactory
{
    /** Default Constructor */
    private CategoryFactory()
    {
    }

    /** Modifier to return the appropriate Category object
     * @param Catalog the catalog this category is in
     * @return Category
     */
    public static final Category createCategory( Catalog catalog )
    {
        Category category = null;
        StringBuffer className = new StringBuffer("com.ibm.catalog.");
        try
        {
            DataStore dataStore = catalog.getDataStore();
            className.append(dataStore.getDBPrefix());
            className.append("Category");
            category = (Category) Class.forName(className.toString()).newInstance();
            category.setCatalog(catalog);
        }
        catch ( Exception e )
        {
            System.err.println("*** ERROR: CategoryFactory.createCategory() - " +
                "instantiating " + className.toString() + " from factory");
            category = null;
        }
        return category;
    }
}

Listing Two

public final static StringLengthCalculator
createStringLengthCalculator(DataStore dataStore)
{
    StringLengthCalculator slc = null;            // the object to be returned
    String packageName = "util.";                 // package name of class
    String className = "StringLengthCalculator";  // base class name

    /* the generic class name is used when there is no specific one */
    StringBuffer genericClassName = new StringBuffer(packageName);  // package name
    genericClassName.append(dataStore.getByteMode());               // byte mode
    genericClassName.append(className);                             // base class name
    /* the specific class name is used in special cases where the
       generic isn't enough */
    StringBuffer specificClassName = new StringBuffer(packageName); // package name
    specificClassName.append(dataStore.getDbPrefix());              // database type
    specificClassName.append(dataStore.getByteMode());              // byte mode
    specificClassName.append(className);                            // base class name
    /* Try to instantiate a specific object first */
    try
    {
        slc = (StringLengthCalculator)Class.
            forName(specificClassName.toString()).newInstance();
    }
    catch (Exception e)
    {
        /* If that fails, try to instantiate a generic object */
        try
        {
            slc = (StringLengthCalculator)Class.
                forName(genericClassName.toString()).newInstance();
        }
        catch (Exception e1)
        {
            slc = null;
            System.err.println("*** ERROR: StringLengthCalculatorFactory." +
                "createStringLengthCalculator() - instantiating " +
                genericClassName.toString() + " from factory");
        }
    }
    return slc;
}

Listing Three

/** Method to return the appropriate URLCommandLink object based on
 * the syntax version
 */
public static final URLCommandLink createURLCommandLink(MerchantServer ms)
{
    URLCommandLink tmpLink = null;
    StringBuffer className = new StringBuffer("com.ibm.catalog.");
    try
    {
        className.append(ms.getURLCommandVersion());
        className.append("URLCommandLink");
        tmpLink = (URLCommandLink) Class.
            forName(className.toString()).newInstance();
    }
    catch ( Exception e )
    {
        System.err.println("URLCommandLinkFactory.createURLCommandLink() - " +
            "could not instantiate class for " + className);
    }
    return tmpLink;
}

DDJ


Cross-Platform Design Strategies





Designing for more 
than one platform 





Bob Krause 


Cross-platform development is not a
niche specialty. All programmers at
some point have to ensure that their
code compiles cleanly and runs efficiently
in multiple environments, even if
those environments are just different
versions of the same compiler.

At NeoLogic, we’ve accumulated a lot 
of experience with this problem. Neo- 
Access, NeoLogic’s cross-platform object 
database, ships with source code, which 
means it must compile and run on a large 
variety of platforms. In this article, I'll sur- 
vey the key aspects of our cross-platform 
architecture that you can use to ensure 
that your feature-rich and extensible code 
can be readily utilized on multiple plat- 
forms. I'll also demonstrate our approach 
by sharing a set of thread classes we’ve 
developed for use on both Macintosh and 
Windows PCs. 


Lowest Common Denominator 

Many developers adopt a lowest common
denominator approach, only using facilities
that are uniformly available on all platforms.
Any libraries or components you want to buy
have to be available for all platforms in
question, which greatly limits the options.
Your application can end up looking only as
good as the worst platform supported by the
components or libraries chosen. This approach
is unacceptable to all but the most captive
markets; the ability to deliver an application
on multiple platforms should not result in
a limited set of features.


Bob is president of NeoLogic Systems, and
architect of the NeoAccess cross-platform
object database engine. Bob can be
reached at neologic@neologic.com.


Conditional Code 

Another common approach is to start by
writing an application for one platform,
then port the application to one platform
at a time, using conditional code to indicate
differences between platforms. The problem
with this approach is that, without a
cross-platform architecture in the initial
implementation, the amount of code that can
be shared across platforms is limited and
the maintainability of the source tree
suffers with each successive port.


Design Strategies 

In most cases, every platform you'd like
to support does have the features you
want, but those features are implemented
in a totally different way. The challenge
is to isolate those platform-specific
features and communicate with them
through an abstraction layer that will
work for all platforms. This is accomplished
by letting the visible interface of
a platform-specific class define how client
code accesses a function without regard
for how the function is implemented.
Encapsulation is preserved and visible
complexity is reduced.

It is interesting to note that of the over 
200 classes and templates in NeoAccess, 
fewer than 10 are platform specific. Many 
other well designed applications can ex- 
pect to meet a standard metric: 95 percent 
platform independent, 5 percent platform 
specific. 

The cross-platform design pattern iso- 
lates and encapsulates the implementation 
of platform-specific functions behind a 
platform-neutral interface. Only a small 
portion has to be rewritten for each plat- 
form. Typically, these classes are very 
straightforward— they provide specific 
functionality on a given platform. In many 
ways, this implementation is the easy, even 
boring, part of coding. 

The hard part is designing an interface 
that presents an appropriate environment- 
neutral set of services to client code. The 
interface to these services should be suf- 
ficiently high level to maximize the return 
from the application developer’s effort 
and minimize the effort involved in mov- 
ing application code to additional plat- 
forms. Ultimately, the collection of plat- 
form-specific classes of the cross-platform 
pattern will support all the features any 
platform-independent applications re- 
quire. When you reach that point, all the 
code you write for applications will be 
platform independent, residing on top of 
this collection of classes. 

Our object database engine was designed 
to be cross platform from the ground up. 
This was accomplished by using the
cross-platform design pattern. I'll explain
this pattern by presenting an exemplar set of
thread classes that provide multithreading
support on both Windows and Mac OS.


Cross-Platform Thread Classes 


Multithreading services provided by the
operating system often differ from platform
to platform, as does the programming
interface to those services. In Windows,
threads are preemptive: another thread
can preempt a running thread, taking
control of the processor on demand. On the
Macintosh, threads are often cooperative,
giving up the processor only by explicitly
yielding execution.

To make the thread behavior consistent,
all thread implementations in NeoAccess
Release 6.0 are cooperative threads, each
thread yielding control explicitly. While this
was trivial to implement on the Macintosh,
it was somewhat more complex on the
preemptive thread-based Windows. On Windows,
a mutual exclusion semaphore was used. In
this environment, all NeoAccess threads
attempt to obtain this semaphore before
proceeding. If another thread holds the
semaphore, the other threads block while
waiting for the holding thread to yield. If
the thread that holds the semaphore is
preempted by the operating system, only
non-NeoAccess user-interface threads will
run. The code that implements these
constructs is part of the implementation of
the platform-specific thread class. As a
result, client code is unaware of these
details.
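The idea of layering cooperative behavior on a
preemptive system can be sketched with a single
shared lock. This is a minimal illustration
using standard C++ rather than the Win32 calls
NeoAccess uses; the CooperativeBaton class and
its member names are hypothetical.

```cpp
#include <mutex>
#include <thread>

// Hypothetical sketch: one shared mutex acts as a "baton". Only the thread
// holding it runs library code, so library threads switch only at yield().
class CooperativeBaton {
public:
    // Block until this thread is allowed to run library code.
    void enter() { fMutex.lock(); }

    // Give up the baton for good when the thread's library work is done.
    void leave() { fMutex.unlock(); }

    // Explicit switch point: release the baton, let others run, take it back.
    void yield() {
        ++fYieldCount;               // safe: the caller still holds the baton
        fMutex.unlock();
        std::this_thread::yield();   // hint to the scheduler to run a waiter
        fMutex.lock();
    }

    // Number of explicit yields so far; call while holding the baton.
    int yieldCount() const { return fYieldCount; }

private:
    std::mutex fMutex;
    int fYieldCount = 0;
};
```

Every library thread would bracket its work with
enter() and leave() and call yield() at safe
points. Note that std::mutex makes no fairness
promise, so a yielding thread may reacquire the
baton immediately; a production version would
need an explicit hand-off.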


Interface 
The CNeoThread class provides an 
environment-neutral interface to multi- 
threading services. This base class is further 
subclassed to provide platform-specific im- 
plementations of the abstract interface and 
to deal with additional services not avail- 
able on all platforms. Following the inter- 
face of the abstract base class and the ser- 
vices provided by the underlying operating 
system, platform-specific subclasses can 
be easily written. 










Figure 1: How the NeoThread classes fit together.


Listing One shows how the platform- 
specific objects are hidden using typedefs. 
CNeoThreadBase is the typedef used to re- 
fer to the underlying base class of the 
CNeoThread abstract base class. When 
NeoAccess is built for use with MFC,
CNeoThreadBase is defined to be CWinThread.
On the Macintosh using the PowerPlant
application framework, CNeoThreadBase is
defined to be the base class of all
PowerPlant threads, LThread. Either
typedefs or #defines can be used to create
such a class name mapping.
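The name mapping can be tried out without MFC
or PowerPlant installed. The sketch below is an
assumption-laden stand-in: the three StandIn
structs are hypothetical replacements for the
framework thread classes, and the generic
fallback branch exists only so the example
builds on any compiler.

```cpp
#include <string>

// Hypothetical stand-ins for framework thread classes such as MFC's
// CWinThread and PowerPlant's LThread.
struct WinThreadStandIn { static std::string Framework() { return "MFC"; } };
struct MacThreadStandIn { static std::string Framework() { return "PowerPlant"; } };
struct PortableStandIn  { static std::string Framework() { return "generic"; } };

// One conditional block maps a platform-neutral name onto a platform class.
#if defined(_WIN32)
typedef WinThreadStandIn ThreadBase;
#elif defined(macintosh) || defined(__APPLE__)
typedef MacThreadStandIn ThreadBase;
#else
typedef PortableStandIn ThreadBase;   // fallback so the sketch builds anywhere
#endif

// Client code names only ThreadBase and never mentions a platform.
class AppThread : public ThreadBase {
public:
    std::string describe() const { return "thread on " + Framework(); }
};
```

Supporting another platform means adding one
branch to the conditional block; AppThread and
everything built on it stay unchanged.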

The subclass providing an
environment-specific implementation of the
CNeoThread interface differs depending on the
target run-time environment. Under Windows,
this class is CNeoThreadMFC. On the Macintosh
it is CNeoThreadPP. The implementation of
these subclasses provides the
environment-specific support of each run-time
environment. The symbol CNeoThreadNative is
the platform-independent name used to refer
to the appropriate platform-specific
subclass. Figure 1 shows how the classes fit
together.

Listing Two is the platform-independent 
CNeoThread class used in all environments. 
The interface of CNeoThread is identical 
on all platforms; it’s the abstract interface 
that the client code is written to. The ac- 
tual code defining the CNeoThread class 
has been abbreviated for simplicity. The 
CNeoThreadBase class is used to define the 
base class of CNeoThread. As in Listing 
One, the definition of CNeoThreadBase dif- 
fers depending on the target platform. 

It's worth noting that the interface to
CNeoThread is sufficient, but not exhaustive.
The interface is designed to judiciously
include all the functions a multithreaded
application requires. Adding more functions
than necessary risks increasing the
effort necessary to support other run-time
environments in the future.

Also, the features of this interface are
not limited to those that are available on
all supported platforms. For example, the
block function includes an argument that
can be used to specify how long the
thread is willing to block waiting before
it times out. The possible values are
kNeoNever, kNeoForever, or some other
value indicating the number of milliseconds
the thread is willing to wait. Yet not
every platform supports the ability to time
out while waiting for a resource. The
description of this function stipulates that
all platforms support kNeoNever and
kNeoForever, and that those platforms that
don’t support a specific amount of time
assume kNeoForever if any value other than
kNeoNever is given.

Finally, note the aTime argument has a 
default value so that the client is free to 
ignore the timeout feature completely. This 
further minimizes visible complexity. 
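The timeout rule reads naturally as a small
function. In this sketch the numeric values of
kNeoNever and kNeoForever are assumptions (the
article does not give them), and
effectiveTimeout and blockTimeout are
hypothetical helper names.

```cpp
typedef long NeoTime;

// Assumed sentinel values; the article does not state the real ones.
const NeoTime kNeoNever   = 0;    // do not wait at all
const NeoTime kNeoForever = -1;   // wait with no time limit

// Returns the timeout a given platform will actually honor.
inline NeoTime effectiveTimeout(NeoTime requested, bool platformHasTimedWait) {
    // Every platform must support the two sentinels unchanged.
    if (requested == kNeoNever || requested == kNeoForever) return requested;
    // A millisecond count degrades to "forever" where timed waits are missing.
    return platformHasTimedWait ? requested : kNeoForever;
}

// The default argument lets callers ignore the timeout feature entirely.
inline NeoTime blockTimeout(bool platformHasTimedWait,
                            NeoTime aTime = kNeoForever) {
    return effectiveTimeout(aTime, platformHasTimedWait);
}
```

A caller that asks for 250 milliseconds gets
exactly that where timed waits exist, and an
unbounded wait elsewhere, without any change to
the calling code.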


Platform-Specific Implementations 

The CNeoThreadBase and CNeoThreadNative
typedefs defined in Listing One refer to
platform-specific thread classes.
CNeoThreadBase refers to the base class of the
platform-independent CNeoThread class.
CNeoThreadNative defines the platform- 
specific thread class that implements the 
CNeoThread interface. These types pro- 
vide platform-neutral class names that can 
be used in all environments. 

For Windows, the symbol CNeoThread- 
Native refers to CNeoThreadMFC in List- 
ing Three. This is a platform-specific class, 
designed for use with Windows and MFC. 
Note the declaration of the gNeoCritical 
critical section semaphore that precedes 
the definition of CNeoThreadMFC. Also 
note references to this semaphore in the 
implementations of some of the inline 
functions of CNeoThreadMFC. 

When you look at the code in Listings 
Three and Four, you'll notice that many 
of the static function prototypes in both 
CNeoThreadMFC and CNeoThreadPP are 
identical. All platform-specific subclasses 
include the same set of static functions 
with identical calling conventions and can 
always be referred to using the
CNeoThreadNative typedef. This results in a
construct that is sometimes called “static
virtual functions.” This idiom is an extension
of the idea of an abstract base class that 
provides a generic interface to which client 
code can be written. The CNeoThread- 
Native symbol extends the interface into 
the subclass by using the member func- 
tions, both static and otherwise, with a 
common interface across all platforms. 
While the prototypes of these functions 
are identical, their platform-specific im- 
plementations may differ. 
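A compact way to see the “static virtual
function” idiom is to give two unrelated classes
the same static prototypes and reach them only
through an alias. The class names, the log
parameter, and the exercise template below are
hypothetical and exist only for illustration.

```cpp
#include <string>
#include <vector>

// Two hypothetical platform classes exposing identical static prototypes.
struct ThreadOnPlatformA {
    static void InitThreads(std::vector<std::string>& log) { log.push_back("A:init"); }
    static void YieldTo(std::vector<std::string>& log)     { log.push_back("A:yield"); }
};
struct ThreadOnPlatformB {
    static void InitThreads(std::vector<std::string>& log) { log.push_back("B:init"); }
    static void YieldTo(std::vector<std::string>& log)     { log.push_back("B:yield"); }
};

// A real build would select this with #if; platform A is chosen arbitrarily.
typedef ThreadOnPlatformA ThreadNative;

// Client code calls the static members through the alias only.
inline std::vector<std::string> startUp() {
    std::vector<std::string> log;
    ThreadNative::InitThreads(log);
    ThreadNative::YieldTo(log);
    return log;
}

// Instantiating this for each class checks at compile time that every
// platform class really offers the same static interface.
template <typename Native>
std::vector<std::string> exercise() {
    std::vector<std::string> log;
    Native::InitThreads(log);
    Native::YieldTo(log);
    return log;
}
```

Nothing in the language enforces the shared
prototypes, which is why a compile-time check
like exercise() can be worth keeping in a test
build.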







All of the platform-specific code has 
been isolated in the conditional declara- 
tion, so that the software utilizing the ob- 
jects (as well as the balance of the ab- 
straction layer) need not know exactly 
what code is executing, or on what plat- 
form. The conditional declaration in List- 
ing One took care of that, mapping the 
platform-specific classes to the standard 
CNeoThreadBase and CNeoThreadNative 
symbols. If additional implementations are 
to be added, new platform-specific class- 
es are created. The conditional declaration 
is then expanded to test for the new plat- 
forms, and select the appropriate platform- 
specific classes. 

It is important to note that, once the 
platform-specific classes are written, the 
multithreading specifics of each platform 
can be virtually forgotten. The developer 
of the class need only keep the concepts 
of how Macintosh or Windows threading 
works in his head for the duration of the 
development of the platform-specific 
thread class. 


Using these Classes 
Listing Five shows how simple life is now,
with the thread objects completely abstracted
away from the platform. Regardless of what
platform is used, the code acts the same way;
only the underlying classes change, and that
process is handled automatically at compile
time.

Programs invariably change. In the typ- 
ical business application, a program is de- 
signed to solve a particular business prob- 
lem. As time goes by, the nature of the 
problem changes, and the business must 
change as well. Consequently, the busi- 
ness application must adapt to fit the new 
problem. For a long time now, we’ve seen 
the advantages of object-oriented pro- 
gramming for supporting the process of 
evolving applications to suit new business 
needs. Nowhere is this technique more 
useful than in a cross-platform imple- 
mentation. 

With the separation of platform-specific 
code from the application code, develop- 
ers can more readily evolve an application 
as business requirements change. They can 
ignore platform code and focus only on 
code pertaining to the business. Revisions 
to the application need to occur only once. 
The code is then compiled for each plat- 
form. In this process, platform-specific code 
is unaffected. 

Should a new platform be introduced 
for the application, the plan for imple- 
menting the application on the new plat- 
form is instantly and abundantly clear— 
each of the platform-specific classes must 
be written for the new platform. The 
specifications of these objects are already 
clearly delineated in the existing appli- 
cation— you just have to sit down and 
write it. These platform-specific classes 
will represent a relatively small portion 
of the application’s source tree— the 
bulk of it will instantly convert to the 
new platform, since there is nothing in 
the application objects that is unique to 
any platform. 

Ultimately, the goal of effective cross- 
platform development is to write as much 
platform-independent code as possible. 
Platform-specific classes isolate the vari- 
ances in behavior between the platforms, 
while still providing robust features to the 
platform-independent software. This is a 
design philosophy, not a language feature. 
You have to design your applications from 
the beginning to be platform independent. 


DDJ 





Listing One

#if defined(WINDOWS)
typedef CWinThread CNeoThreadBase;    // MFC's thread class
// The base class of all application threads is Neo's MFC thread class
typedef CNeoThreadMFC CNeoThreadNative;
#elif defined(macintosh)
typedef LThread CNeoThreadBase;       // PowerPlant's thread class
// The base class of all application threads is Neo's PP thread class
typedef CNeoThreadPP CNeoThreadNative;
#endif

Listing Two

class CNeoThread : public CNeoThreadBase {
public:
    CNeoThread(void **aArg,
               const NeoThreadOptions aOptions,
               const NeoPriority aPriority);
    virtual ~CNeoThread(void);
    virtual void block(CNeoSemaphoreNative *aSemaphore, const long aParam,
                       const NeoTime aTime = kNeoForever) = 0;
    virtual NeoThreadState getState(void) const = 0;
    virtual long run(void);
    virtual void setState(const NeoThreadState aState,
                          const NeoThreadID aNext = kNeoNoThread) = 0;
    virtual void suspend(void) = 0;
    virtual void unblock(CNeoSemaphoreNative *aSemaphore) = 0;
    virtual void yield(CNeoThread *aTo = nil) = 0;
protected:
    NeoThreadOptions fOptions;
};

Listing Three

extern CRITICAL_SECTION gNeoCritical;

class CNeoThreadMFC : public CNeoThread {
public:
    CNeoThreadMFC(NeoThreadOptions aOptions,
                  const NeoPriority aPriority,
                  NeoUserThreadFunc aUserFunc);
    virtual OSErr block(CNeoSemaphoreNative *aSemaphore,
                        const long aParam, NeoTime aTime = kNeoForever);
    virtual NeoThreadState getState(void) const {return fState;}
    void resume(void) {
        fState = kNeoThreadReadyState;
        ResumeThread();
    }
    virtual void setState(const NeoThreadState aState,
                          const NeoThreadID aNext = kNeoNoThread);
    virtual void sleep(unsigned long aTime);
    void suspend(void);
    virtual OSErr unblock(CNeoSemaphoreNative *aSemaphore);
    virtual void yield(CNeoThread *aTo = nil);
    static void BeginCriticalSection(void) {}
    static void EndCriticalSection(void) {}
    static void InitThreads(void) {
        ::InitializeCriticalSection(&gNeoCritical);
    }
    virtual void Sleep(unsigned long aTime) {sleep(aTime);}
    static void YieldTo(CNeoThread *aTo = nil) {
        GetCurrent()->yield(aTo);
    }
protected:
    NeoThreadState fState;
};

Listing Four

class CNeoThreadPP : public CNeoThread {
public:
    /** Instance Member Functions **/
    CNeoThreadPP(void **aArg, const NeoThreadOptions aOptions,
                 const NeoPriority aPriority);
    /** Access Member Functions **/
    virtual OSErr block(CNeoSemaphoreNative *aSemaphore, const long aParam,
                        const NeoTime aTime = kNeoForever);
    virtual NeoThreadState getState(void) const;
    virtual void setState(const NeoThreadState aState,
                          NeoThreadID aNext = kNoThreadID);
    void suspend(void) {Suspend();}
    virtual OSErr unblock(CNeoSemaphoreNative *aSemaphore);
    virtual void yield(CNeoThread *aTo = nil) {
        CNeoThread::Yield(aTo);
    }
    /** Static Member Functions **/
    static void BeginCriticalSection(void) {EnterCritical();}
    static void EndCriticalSection(void) {ExitCritical();}
    static CNeoThreadPP * GetCurrent(void) {
        return (CNeoThreadPP *)GetCurrentThread();
    }
    static void InitThreads(void);
    static void YieldTo(CNeoThread *aTo = nil) {
        CNeoThread::Yield(aTo);
    }
protected:
    /** Macintosh-Specific Member Functions **/
    static void * GetTaskRef(void) {return sThreadTaskRef;}
    static void IOComplete(ParmBlkPtr pbPtr);
    static void AsyncIOResume(CNeoThreadPP *aThread);
    void setIOCompleteProc(NeoThreadBlock *aBlock,
                           NeoCompletionProc aProc = nil);
    OSErr waitUntilIOCompletes(NeoThreadBlock *aThreadBlock,
                               OSErr & volatile aError);
    virtual void setEpilogue(NeoThreadEpilogue aEpilogue, void *aParam);
};

Listing Five

class CMyThread : public CNeoThreadNative {
public:
    CMyThread(void **aArg = nil,
              const NeoThreadOptions aOptions = kCreateIfNeeded,
              const NeoPriority aPriority = kNeoPriorityNormal);
    virtual ~CMyThread(void);
    /** Access Member Functions **/
    virtual long run(void);
};


A DNA Sequence Class in Perl





Using Perl’s object- 
oriented and text- 
manipulation features 





Lincoln Stein 


The Human Genome Project is a
multinational project to determine the
entire human DNA sequence by the
year 2003. Obtaining this information
means pushing around massive amounts
of information; estimates quickly run
into the terabytes. This in turn requires
sophisticated software engineering,
fault-tolerant information systems, and rapid
application development.

In this article, I describe a Perl library 
for manipulating DNA and RNA sequences. 
In the course of examining this library, 
you’ll see how Perl’s object-oriented
features work together to create an elegant
API. And hopefully, you'll learn a little
biology as well.


DNA, RNA, and Proteins 

The stuff of the genome is deoxyribonucleic
acid (DNA), a long thin molecule that
is usually compactly coiled into the
chromosomes of our cells. DNA consists of four
distinct subunits, called “nucleotide bases,”
which are repeated across its entire length.
The four bases have been assigned the
convenient single-letter names A, G, C, and T,
abbreviations for their longer chemical names.


Lincoln develops databases, applications,
and user interfaces for the Human
Genome Project at Cold Spring Harbor
Laboratory in Long Island, NY. His books
on web software development include The
Official Guide to CGI.pm (John Wiley &
Sons, 1998) and Writing Apache Modules
in Perl and C (O'Reilly & Associates, 1999).

In DNA, the nucleotide bases are linked 
together into long chains that can be writ- 
ten down as an ASCII string. Figure 1(a), 
for example, is a DNA sequence consist- 
ing of 39 nucleotides. 


DNA doesn’t usually float around the 
cell in its single-stranded form. Instead 
it spends its time in a stable double- 
stranded form, the famous “double he- 
lix.” In the double-stranded form, each 
nucleotide is paired with another nu- 
cleotide. Because of their chemical na- 
ture, A always pairs with T, and G pairs 
with C. Written down in a text represen- 
tation, the double-stranded form of this 
short sequence looks like Figure 1(b). 

Because the nucleotide bases are paired,
they are often referred to as base pairs
(bp). I’ve labeled the left end of the top
strand 5’ and the right end 3’. On the bottom
strand, the numbering of the two ends
is reversed. This numbering system is
related to the way that DNA is put together
chemically. Here, the only significance
of this is that it emphasizes that DNA
strands are directional. The two strands
are often arbitrarily labeled the “plus” and
“minus” strands to distinguish them.
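The pairing and direction rules translate
directly into string operations. The following
C++ sketch is an illustration only (the
article's library is written in Perl), and the
function names are hypothetical.

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>

// Pairing partner of a single DNA base: A pairs with T, G pairs with C.
inline char complementBase(char base) {
    switch (base) {
        case 'A': return 'T';
        case 'T': return 'A';
        case 'G': return 'C';
        case 'C': return 'G';
        default:  throw std::invalid_argument("not a DNA base");
    }
}

// Partner strand in the same left-to-right order, so it reads 3' to 5'.
inline std::string complement(const std::string& strand) {
    std::string out(strand);
    std::transform(out.begin(), out.end(), out.begin(), complementBase);
    return out;
}

// Partner strand rewritten in the conventional 5' to 3' direction.
inline std::string reverseComplement(const std::string& strand) {
    std::string out = complement(strand);
    std::reverse(out.begin(), out.end());
    return out;
}
```

Applying complement() to the top strand of
Figure 1(b) yields the bottom strand as printed;
reverseComplement() yields the same strand read
from its own 5’ end.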

DNA can do just two things: It can repli- 
cate, and it can be transcribed into RNA. 
The replication process is the key to both 
cell replication and to propagation of the 
species. The two strands of DNA unwind 
like a zipper, and each strand dictates the 
assembly of its complementary second 
strand. Schematically, the process looks 
like Figure 1(c). 

More interesting is the transcription and 
translation process. Along its length, DNA 
encodes the instructions for many thou- 
sands of proteins, everything from the crys- 
talline protein of the eye lens to the en- 
zymes that make up the digestive juices 
of the gastrointestinal tract. These protein 
coding regions, separated from each oth- 
er by large tracts of DNA of unknown 
function, are in fact genes. 

To make a protein from the DNA se- 
quence of a gene, the cell performs two 
phases of chemical transformation. In the 
first phase, the gene is transcribed into ri- 
bonucleic acid (RNA). RNA is like DNA 
in many ways, but instead of being dou- 
ble-stranded it usually exists in single- 
stranded form. In addition, instead of be- 
ing composed of the four bases A, G, C, 
and T, RNA has no T, but uses a differ- 
ent nucleotide abbreviated as U. 

To transcribe RNA, DNA unwinds just 
a bit in the region of an activated gene, 
and the nucleotide sequence of the DNA 
is read off by enzymes that synthesize an 
RNA copy of the gene. Sometimes the plus 
DNA strand is transcribed, and sometimes 
the minus strand, depending on whether 
the gene is oriented right-to-left or
left-to-right.

Represented in text form, an RNA strand 
looks just like its parent DNA strand except 
for the substitution of U for every T. Figure 
2(a) depicts our example DNA in RNA form. 

Unlike DNA (which never leaves the nu- 
cleus of the cell), RNA is free to travel 
through the nuclear envelope into the cel- 
lular cytoplasm. Once there, the RNA is 
translated into a protein. Like RNA and DNA, 
proteins are also long strands of repeating 
units. However, instead of there being only
four units, proteins are made up of 20
different “amino acid” subunits. Proteins fold
into complex structures dictated by the or- 
der of their amino acids. The folding deter- 
mines the protein’s structure and function. 

Like the nucleotide bases, biologists use 
one-letter abbreviations to refer to the 
amino acids as well. Protein sequences use 
the letters A, C, D, E, F, G, H, I, K, L, M,
N, P, Q, R, S, T, V, W, and Y. Because there 
just aren’t enough letters in the Latin al- 
phabet to go around, the protein alpha- 
bet overlaps with the nucleotide alphabet, 
but don’t let that confuse you. An A found 
in a nucleic acid sequence has nothing to 
do with the A of a protein sequence. 

Because only four RNA bases must dictate
the order of 20 amino acids, there is
obviously more to protein translation than the
simple one-to-one encoding that takes place
during transcription. In fact, the translation
machinery uses a three-letter code to
translate RNA into protein. During
translation, the RNA is divided into groups of
three-letter “codons,” as in Figure 2(b).

The codons are used as a template to 
synthesize a protein sequence, using a lit- 
tle lookup table that’s hardwired into the 
biological machinery. AUG becomes the 
amino acid M, UUC becomes F, CGA be- 
comes R, and so forth. Our example DNA 
is translated into a 12 amino acid protein 
in Figure 2(c). 

There are two things to notice in this 
example. One is that certain amino acids 
are encoded by several different codons. 
For example, the amino acid K is en- 
coded by both AAA and AAG. This 
should be expected from the fact that 
there are 64 possible codons, and only 
20 amino acids for them to encode. The 
other thing to notice is that certain 
codons (three in all) don’t encode any 
amino acids. Instead they are “stop 
codons,” which tell the translation ma- 
chinery to stop translating and release 
the finished protein. Generally, the RNA 
molecule extends farther to the left and 
right than the protein it encodes (I’ve 
glossed over this fact for simplicity of il- 
lustration). Like the stop codons, the 
AUG codon is special because it tells the 
protein translation machinery with which 
codon to begin. 
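Transcription and translation as described
above fit in a few lines of code. This is a
hedged C++ sketch rather than the article's Perl
library: the function names are hypothetical,
the table is the standard genetic code with '*'
marking stop codons, and translation simply
starts at the first AUG it finds.

```cpp
#include <algorithm>
#include <string>

// Transcription in text form: every T becomes U.
inline std::string transcribe(std::string dna) {
    std::replace(dna.begin(), dna.end(), 'T', 'U');
    return dna;
}

// Standard genetic code with bases ordered U, C, A, G at each codon position.
static const char kCodonTable[] =
    "FFLLSSSSYY**CC*W"    // first base U
    "LLLLPPPPHHQQRRRR"    // first base C
    "IIIMTTTTNNKKSSRR"    // first base A
    "VVVVAAAADDEEGGGG";   // first base G

// Position of a base in the U, C, A, G ordering, or -1 if unknown.
inline int baseIndex(char base) {
    switch (base) {
        case 'U': return 0;
        case 'C': return 1;
        case 'A': return 2;
        case 'G': return 3;
        default:  return -1;
    }
}

// Translate from the first AUG until a stop codon or the end of the RNA.
inline std::string translate(const std::string& rna) {
    std::string protein;
    std::string::size_type start = rna.find("AUG");
    if (start == std::string::npos) return protein;   // no start codon
    for (std::string::size_type i = start; i + 3 <= rna.size(); i += 3) {
        int a = baseIndex(rna[i]);
        int b = baseIndex(rna[i + 1]);
        int c = baseIndex(rna[i + 2]);
        if (a < 0 || b < 0 || c < 0) break;            // unknown base
        char amino = kCodonTable[16 * a + 4 * b + c];
        if (amino == '*') break;                       // stop codon
        protein += amino;
    }
    return protein;
}
```

Run on the 39-nucleotide example, the two steps
give the RNA of Figure 2(a) and a protein of 12
amino acids, with the final UGA codon ending
translation.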


(a)

ATGTTCCGAAAATCCCCGATTTGGACTAAGCCTGTGTGA

(b)

5' ATGTTCCGAAAATCCCCGATTTGGACTAAGCCTGTGTGA 3'
3' TACAAGGCTTTTAGGGGCTAAACCTGATTCGGACACACT 5'

(c)

[Diagram not recoverable: the two strands separate and each one directs the assembly of its complementary strand.]

Figure 1: (a) DNA sequence consisting of 39 nucleotides; (b) double-stranded
form of the DNA sequence; (c) DNA replication process.


(a)

5' AUGUUCCGAAAAUCCCCGAUUUGGACUAAGCCUGUGUGA 3'

(b)

5' AUG UUC CGA AAA UCC CCG AUU UGG ACU AAG CCU GUG UGA 3'

(c)

RNA      5' AUG UUC CGA AAA UCC CCG AUU UGG ACU AAG CCU GUG UGA 3'
Protein      M   F   R   K   S   P   I   W   T   K   P   V  (stop)

Figure 2: (a) Sequence in RNA form; (b) RNA sequence divided into codons; (c)
RNA translated into protein.


A Sequence Class Library for Perl

A lot of Genome informatics involves splic- 
ing, dicing, and processing long strings of 
DNA sequences. I created a library of Perl 
routines specialized for dealing with DNA 
(available electronically; see “Resource 
Center,” page 5), with a small class hier- 
archy like Figure 3. 

Sequence::Generic is an abstract class 
that implements a few generic methods 
that all biological sequences share, such 
as a method for determining the se- 
quence’s length and a method for con- 
catenating two sequences together. Se- 
quence::Nucleotide is a subclass of 
Sequence::Generic that adds support for 
DNA- and RNA-specific operations. One 
of these new operations is the reverse 
complementation method, which trans- 
forms one strand of DNA into its com- 
plement; another is a method to translate 
RNA into protein. 

Sequence::Nucleotide::Subsequence is
a descendant of Sequence::Nucleotide.
Because the chunks of DNA that need to 
be analyzed are usually quite long 
(100,000 bp is not unusual), it’s typical 
to work with one subregion at a time. A 
Subsequence represents a subregion of a 
longer sequence. 

The Sequence::Alignment class is a utility
class that stores information about how
two similar sequences are related. It is use- 
ful for figuring out how a smaller sequence 
fits into a larger one. 

For completeness, there should also be 
a Sequence::Protein class descended from 
Sequence::Generic, but that was too much 
to squeeze into this article. Instead of re- 
turning a real Sequence::Protein object, 
the method that translates RNA into pro- 
teins just returns a simple character string. 


The Sequence::Generic Class
Sequence::Generic (Listing One) defines
three methods that are intended to be
overridden by child classes: new(), seq(),
and type(). The new() method is the object
constructor. It does nothing but call
the croak() function from the Carp package
to abort the program with an error
message. This prevents the generic class
from being instantiated. The seq() method
is a low-level routine that returns the raw
sequence information as a text string. This
method also croaks, in case Sequence::Generic
is subclassed without the seq()
method being overridden. The type()
method returns a human-readable string


sub length {
    my $self = shift;
    return length($self->seq);
}

Example 1: The length method.
describing the type of the sequence, and 
is intended for debugging work. It’s in- 
tended to return something like DNA, 
RNA, or “Protein.” In the abstract class, 
this method returns “Generic Sequence.” 

The remainder of the methods are 
generic ones that will work with almost 
any biological sequence. One of these is 
length(), which returns the length of the 
sequence data; see Example 1. By con- 
vention, Perl methods are invoked with a 
reference to the object as the first argu- 
ment on the subroutine argument list. This 
method begins with the idiom my $self = 
shift. The effect of this statement is to shift 
the object off the argument list and to copy 
it into a local variable named $self. The 
method then invokes the object's seq()
method with the Perl method-invocation
syntax $self->seq and passes the result to the Perl
string-length function length() (this is a
normal function call, not a method call).
The result is then returned to the caller. 

Another method defined in this file, con- 
catenate( ) (see Example 2), concatenates 
two Sequence::Generic objects together or 
concatenates a Sequence::Generic object 
with a string, returning a new sequence 
object as the result. 

In addition to its object reference, the 
method takes two arguments. The first is 
the new sequence to concatenate to the 
current one. The second argument is a flag
that indicates whether the new sequence
is to be prepended (true) or appended
(false). concatenate() is usually called via
operator overloading, and the Perl overload
machinery actually takes care of setting
up the two arguments.

Example 2: The concatenate() method.

Example 3: Overloading operators in the Sequence::Generic class.

Example 4: Using the concatenate operator.

The method first checks whether the new
sequence is an object by calling the Perl
built-in ref(), which returns the class name
for objects and an empty (false) string for
nonreferences. If ref() indicates that the new
sequence is an object, concatenate() checks
whether it is a subclass of Sequence::Generic
by using the built-in isa() method. The
__PACKAGE__ token is replaced by
Perl with the name of the current
package, and avoids having to hardcode
the name of the class. If the object is not a
subclass, the routine aborts with an error
message. Otherwise, it recovers the sequence
as a string by calling its seq()
method. If the $new_seq argument isn't an
object at all, the method treats it as a string.

The last statement of this method uses 
the Perl built-in concatenation operator 
“.” to combine the sequence strings to- 
gether in the order dictated by the 
$prepend flag. The concatenated string is 
passed to the object’s new() constructor 
to create a new Sequence object, which 
is returned to the caller. Because concatenate()
will be called from a subclass
of Sequence::Generic, the new() constructor
that gets called will belong to the
subclass, not to Sequence::Generic. In Perl
there is no strong distinction between con- 
structors and object methods, which may 
be a source of confusion for C++ and Java 
programmers. 

Perl lets you overload many of its built- 
in operators so that when they are applied 
to objects they invoke a method call rather 
than take their default actions. I overload 
three different operators in the Sequence::Generic
class (Example 3). For
example, by binding the "." operator to
concatenate(), each of the constructions
in Example 4 will work in the natural way.
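
The concatenate-plus-overloading pattern described above is not specific to Perl. The following is a minimal sketch in Python rather than the article's Perl, and every name in it (GenericSequence, Dna, the sample strings) is invented for illustration. It shows the same three ideas: accept either an object or a plain string, honor a prepend flag, and build the result through the caller's own class so that subclasses get back objects of their own type. Python's __add__ and __radd__ stand in for binding "." with the overload pragma.

```python
class GenericSequence:
    """Rough Python analogue of an abstract sequence class (illustrative names)."""

    def __init__(self, data):
        # Like the croak() in the abstract constructor: refuse instantiation.
        raise NotImplementedError("subclasses must provide a constructor")

    def seq(self):
        raise NotImplementedError("subclasses must override seq()")

    def __len__(self):
        return len(self.seq())

    def concatenate(self, other, prepend=False):
        # Accept either another sequence object or a plain string.
        if isinstance(other, GenericSequence):
            text = other.seq()
        elif isinstance(other, str):
            text = other
        else:
            raise TypeError("argument must be a string or a sequence object")
        joined = text + self.seq() if prepend else self.seq() + text
        # type(self) selects the subclass constructor, much as $self->new does.
        return type(self)(joined)

    def __add__(self, other):       # sequence + other
        return self.concatenate(other)

    def __radd__(self, other):      # "string" + sequence: operands arrive swapped
        return self.concatenate(other, prepend=True)


class Dna(GenericSequence):
    def __init__(self, data):
        self._data = data.upper()

    def seq(self):
        return self._data


a = Dna("gattc")
b = Dna("ccgat")
print((a + b).seq())       # GATTCCCGAT
print((a + "aaa").seq())   # GATTCAAA
print(("aaa" + a).seq())   # AAAGATTC
```

The swapped-operand case is where the prepend flag earns its keep: when the string is on the left, the method still runs on the sequence object, so it has to be told that its argument belongs in front.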


The Sequence::Nucleotide Class 
Sequence::Nucleotide (Listing Two) is a 
dual-purpose class that represents both 
DNA and RNA. Because DNA can be 
transformed into RNA and vice versa sim- 
ply by exchanging Ts and Us, I store the 
data as DNA and transform it into an RNA 
form on demand. 

This module begins by loading the other
modules it depends on, including
Sequence::Generic, Sequence::Nucleotide::Subsequence,
Sequence::Alignment, and Carp.

One difference between Sequence::Nucleotide
and Sequence::Generic is its use
of the @ISA global. The @ISA array contains
a list of all the classes that the current
one inherits from. Unlike Java, Perl's
object system allows for multiple inheritance,
although this feature is rarely needed.
In this case, @ISA is a one-element list
containing the name of the superclass,
Sequence::Generic.

The next line in Listing Two defines a
private package variable named %CODON_TABLE.
It is a Perl hash table (associative
array) that maps the 64 RNA codons
to the 20 amino-acid protein alphabet.

The first method this class defines is the 
new() constructor, defined in Example 5. 
new() creates a hash that contains
the keys data and type. The data key will
point to the raw nucleotide string data, 
canonicalized into upper-case DNA form. 
The “type” key is either DNA or RNA, in- 
dicating whether the sequence is to be 
displayed in DNA or RNA form. 

The new() method can be called in sev- 
eral different contexts. It can be called “de 
novo” as a class constructor with a plain 
string argument to be interpreted as sequence
data, as in Example 6(a). Alternatively,
the argument might be another Sequence
object (either a Sequence::Nucleotide
or another subclass of Sequence::Generic),
in which case the constructor should re- 
turn a clone of the original sequence, as in 
Example 6(b). A final context in which the 
constructor might be called is one in which 
new() is used as an object method call 
rather than as a constructor. In this case, 
we want to return a new object of the same 
subclass as the object, as in Example 6(c). 
Finally, the new() method takes an op- 
tional second argument that can be used 
to force the sequence type. If the second 
argument is omitted, the method will guess 
whether the sequence is RNA or DNA by 
looking for the presence of “U-base pairs,” 
as in Example 6(d). 

When a method is called in the class con- 
structor style (as in the first two examples 
above), the first argument passed to the 
function is a string containing the class 
name. When a method is called as an ob- 
ject method (as in the third example), the 
argument is a reference to the object. The 
first thing this method does is to recover 
the class name from the object reference 
in the event that the argument is a refer- 
ence rather than an ordinary string. This 
ensures that both class constructor and 
method call styles work properly. 

The method recovers the other argu- 
ments from the subroutine argument list, 
storing them in local variables $sequence 
and $type. It also initializes an empty 
hash reference using the anonymous 
hash constructor {}, and uses the bless 
operator to associate this reference with 
the current class, turning it into an ob- 
ject reference. 

new() then examines the $sequence argument.
If it is an object reference, the
method determines whether the object implements
the seq() method by invoking
the built-in can() method. The decision
to use can() here rather than isa(), as I
did in Sequence::Generic, was an arbitrary
choice, motivated only by the desire to
show something new. If the object doesn't
implement seq(), it isn't likely to be a
Sequence, so new() croaks with an error message.
Otherwise, it clones the object by
the simple expedient of copying
its entire hash table.

Example 6: Calling the new method.

If the $sequence argument is not an
object reference, the method treats it as
a string. new() does a quick check to
see if it looks like a nucleic acid by
matching it against a regular expression
containing the characters GATC and U. I
also allow whitespace to match (character
class \s) and the N character, commonly
used in experimental data to indicate
an unknown or ambiguous base.
If the sequence passes this test, new()
passes it to a private routine named
_canonicalize() to fix case, strip out
whitespace, and convert the sequence
into DNA form, if necessary. The canonicalized
sequence is then stored in the
data field of our object's hash reference.
new() also sets the type field to contain
either the value provided by the caller,
or, if not provided, a guess based on
the nucleotide composition of the provided
sequence. new() returns the new
object as the function result.
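
The constructor logic just described can be summarized in a short sketch. It is written in Python rather than the article's Perl, and the class and attribute names are made up for the example. It covers the string case (validate, canonicalize to upper-case DNA, guess the type from the presence of U unless the caller forces it) and the clone case (accept any object that offers a seq method, then copy its fields). The Perl-specific case of calling new() on an existing object has no direct Python counterpart and is left out.

```python
import re


class Nucleotide:
    """Illustrative Python counterpart of the constructor logic described above."""

    def __init__(self, sequence, seq_type=None):
        if isinstance(sequence, str):
            # Accept only G, A, T, C, U, N and whitespace, in either case.
            if not re.fullmatch(r"[gactnu\s]+", sequence, re.IGNORECASE):
                raise ValueError("Doesn't look like sequence data")
            self.data = self._canonicalize(sequence)
            # An explicit type wins; otherwise guess RNA if any U is present.
            guessed = "RNA" if re.search("u", sequence, re.IGNORECASE) else "DNA"
            self.type = seq_type or guessed
        elif hasattr(sequence, "seq"):
            # Duck-typing check in the spirit of can('seq'), then clone the fields.
            self.__dict__.update(vars(sequence))
        else:
            raise TypeError("Can't initialize sequence from non-sequence object")

    @staticmethod
    def _canonicalize(text):
        # Store everything as upper-case DNA with no whitespace.
        return re.sub(r"\s+", "", text).upper().replace("U", "T")

    def seq(self):
        # Convert back to RNA only when the object is flagged as RNA.
        return self.data.replace("T", "U") if self.type == "RNA" else self.data


dna = Nucleotide("gat tac a")
rna = Nucleotide("gauuaca")
forced = Nucleotide("gattaca", "RNA")
clone = Nucleotide(rna)
print(dna.seq(), rna.seq(), forced.seq(), clone.seq())
```

Storing one canonical form and converting on output keeps every other method free of DNA-versus-RNA special cases, which is the design choice the article makes as well.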

The translate() method takes the cur- 
rent sequence and translates it into pro- 
tein sequence. It often happens with nov- 
el DNA sequences that you don’t know 
in advance where the gene or genes ac- 
tually begin. The part of the gene that 
encodes the protein may start at any of 
three offsets along the strand, and may 
be read either from the plus or the mi- 
nus strand, giving a total of six possible 
“reading frames” that the protein may be 
read from. 

The translate() method accepts an op- 
tional frame number argument, which can 
be any of the integers 1, 2, 3, or -1, -2, -3,
and returns the protein translation for that 
reading frame. If no argument is provided, 
the routine returns the translation begin- 
ning from the first nucleotide in the se- 
quence, reading frame +1. After a bit of ad- 
justment to trim the sequence to an even 
multiple of three, the core of the transla- 
tion routine is: 


$s =~ s/(\S{3})/$CODON_TABLE{$1} || 'X'/eg;


This is a Perl global pattern match and 
substitution operation. It finds codons by 
identifying groups of exactly three nonwhitespace
characters, \S{3}, and replaces
them with the amino acid value looked 
up in %CODON_TABLE. If, for some rea- 
son, the codon isn’t present in the table 
(perhaps because of an ambiguous “N” in 
the sequence), an X is used for the cor- 
responding amino acid residue. The trans- 
lated sequence is then returned as the 
function result. 
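
To make the six reading frames concrete, here is a small sketch in Python (not the article's Perl). The function names are invented, the codon table is generated from the standard genetic code rather than typed out, and the sketch assumes that the minus strand is obtained by reverse-complementing the stored DNA. As in the Perl one-liner, any triplet missing from the table becomes X.

```python
from itertools import product

# Standard genetic code, enumerated with bases in the order U, C, A, G.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3' (assumed meaning of 'minus strand')."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna, frame=1):
    """Translate DNA in reading frame 1, 2, 3 (plus strand) or -1, -2, -3 (minus)."""
    if frame not in (1, 2, 3, -1, -2, -3):
        raise ValueError("frame must be one of 1, 2, 3, -1, -2, -3")
    strand = dna.upper() if frame > 0 else reverse_complement(dna.upper())
    rna = strand[abs(frame) - 1:].replace("T", "U")
    usable = len(rna) - len(rna) % 3          # trim to a whole number of codons
    return "".join(CODON_TABLE.get(rna[i:i + 3], "X") for i in range(0, usable, 3))


# The 39-nucleotide sequence from Figure 2, written as DNA.
example = "ATGTTCCGAAAATCCCCGATTTGGACTAAGCCTGTGTGA"
print(translate(example))        # MFRKSPIWTKPV*
print(translate("ATGNNN"))       # MX
```

Calling translate() once for each of the six frame numbers and comparing the stretches between stop codons (*) is one simple way to look for the frame that actually encodes a protein.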


Conclusion 

Perl has met the needs of the Genome Pro- 
ject admirably so far and will probably 
continue to do so for years to come. In 
this article, I’ve tried to give you a taste of 
how Perl can reach beyond its “quick and 
dirty” heritage to build a set of object- 
oriented classes. These classes can, in turn, 
serve as the foundation for large and com- 
plex software projects. 

If you are interested in learning more 
about the use of Perl in biology, check 
out the Bioperl Project at http://bio.perl.org/.
This cooperative project is creating
an extensive class library of biologically
important objects. Here you'll find
full-featured cousins of the simple nu- 
cleotide sequence classes presented here, 
as well as Perl classes for proteins, genes, 
genetic maps, phylogenetic trees, and 3D 
protein structures. 


DDJ 
(Listings begin on page 60.) 
Listing One


package Sequence::Generic;
# File: Sequence/Generic.pm

use strict;
use Carp;
use overload
    '""'       => 'asString',
    'neg'      => 'reverse',
    '.'        => 'concatenate',
    'fallback' => 'TRUE';

# These methods should be overridden by child classes
# class constructor
sub new {
    my $class = shift;
    croak "$class must override the new() method";
}
# Return the sequence as a string
sub seq {
    my $self = shift;
    croak ref($self)," must override the seq() method";
}
# Return the type of the sequence as a human readable string
sub type {
    return 'Generic Sequence';
}

# These methods probably don't have to be overridden
# The length of the sequence
sub length {
    my $self = shift;
    return length($self->seq);
}
# The reverse of the sequence
sub reverse {
    my $self = shift;
    my $reversed = reverse $self->seq;
    return $reversed;
}
# A human-readable description of the object
sub asString {
    my $self = shift;
    return $self->type . '(' . $self->length . ' residues)';
}

# Concatenate two sequences together and return the result
sub concatenate {
    my $self = shift;
    my ($new_seq,$prepend) = @_;
    my ($to_append);
    if (ref($new_seq)) {
        croak "argument to concatenate must be a string or a Sequence object"
            unless $new_seq->isa(__PACKAGE__);
        $to_append = $new_seq->seq;
    } else {
        $to_append = $new_seq;
    }
    return $self->new($prepend ? $to_append . $self->seq
                               : $self->seq . $to_append);
}

1;



Listing Two 


package Sequence::Nucleotide;
# file: Sequence/Nucleotide.pm

use Sequence::Generic;
use Sequence::Nucleotide::Subsequence;
use Sequence::Alignment;
use Carp;

use strict;
use vars '@ISA';
@ISA = 'Sequence::Generic';

my %CODON_TABLE = (
    UCA => 'S',UCG => 'S',UCC => 'S',UCU => 'S',
    UUU => 'F',UUC => 'F',UUA => 'L',UUG => 'L',
    UAU => 'Y',UAC => 'Y',UAA => '*',UAG => '*',
    UGU => 'C',UGC => 'C',UGA => '*',UGG => 'W',
    CUA => 'L',CUG => 'L',CUC => 'L',CUU => 'L',
    CCA => 'P',CCG => 'P',CCC => 'P',CCU => 'P',
    CAU => 'H',CAC => 'H',CAA => 'Q',CAG => 'Q',
    CGA => 'R',CGG => 'R',CGC => 'R',CGU => 'R',
    AUU => 'I',AUC => 'I',AUA => 'I',AUG => 'M',
    ACA => 'T',ACG => 'T',ACC => 'T',ACU => 'T',
    AAU => 'N',AAC => 'N',AAA => 'K',AAG => 'K',
    AGU => 'S',AGC => 'S',AGA => 'R',AGG => 'R',
    GUA => 'V',GUG => 'V',GUC => 'V',GUU => 'V',
    GCA => 'A',GCG => 'A',GCC => 'A',GCU => 'A',
    GAU => 'D',GAC => 'D',GAA => 'E',GAG => 'E',
    GGA => 'G',GGG => 'G',GGC => 'G',GGU => 'G',
);

*complement = *reversec = \&reverse;

sub new {
    my $class = shift;
    $class = ref($class) if ref($class);   # allow object-method style calls
    my ($sequence,$type) = @_;

    my $self = bless {},$class;
    if (ref($sequence)) {
        croak "Can't initialize sequence from non-Sequence object.\n"
            unless $sequence->can('seq');
        %{$self} = %{$sequence};   # clone operation
    } else {
        croak "Doesn't look like sequence data"
            unless $sequence=~/^[gactnu\s]+$/i;
        $self->{'data'} = $self->_canonicalize($sequence);
        $self->{'type'} = $type || ($sequence=~/u/i ? 'RNA' : 'DNA');
    }
    return $self;
}
sub seq {
    my $self = shift;
    $self->{'data'} = $self->_canonicalize($_[0]) if defined($_[0]);
    my $seq = $self->{'data'};
    return $seq unless $self->is_RNA;
    $seq=~tr/T/U/;
    return $seq;
}
sub type {
    my $self = shift;
    return defined($_[0]) ? $self->{'type'} = $_[0] : $self->{'type'};
}
sub is_DNA {
    my $self = shift;
    return $self->type eq 'DNA';
}
sub is_RNA {
    my $self = shift;
    return $self->type eq 'RNA';
}
sub subseq {
    my $self = shift;
    my ($start,$end) = @_;
    return (__PACKAGE__ . '::Subsequence')->new($self,$start,$end);
}
sub reverse {
    my $self = shift;
    return (__PACKAGE__ . '::Subsequence')->new($self,$self->length,1);
}
sub translate {
    my $self = shift;
    my $frame = shift() || 1;
    my $l = $self->length;
    my $seq = $frame > 0 ? $self->subseq($frame,$l-($l-$frame+1)%3)
                         : $self->reverse->subseq(abs($frame),$l-($l-abs($frame)+1)%3);
    my $s = $seq->seq;
    $s=~tr/T/U/;    # put it in RNA mode
    $s =~ s/(\S{3})/$CODON_TABLE{$1} || 'X'/eg;
    return $s;
}
sub longest_orf {
    my $self = shift;

    my ($max,$pos,$frame);
    foreach (-3..-1,1..3) {
        my $translation = $self->translate($_);
        # each run of non-stop residues is a candidate open reading frame
        while ($translation=~/([^*]+)/g) {
            if (length($1) > length($max)) {
                $max = $1;
                $frame = $_;
                $pos = pos($translation) - length($max);
            }
        }
    }
    $pos *= 3;
    $pos += abs($frame);
    return ($pos,$pos+3*length($max)-1) if $frame > 0;
    return ($self->length-$pos,$self->length-$pos-3*length($max));
}
sub align {
    my $self = shift;
    my $seq = shift;
    $seq = $seq->seq if ref($seq);
    return new Sequence::Alignment(src=>$seq,target=>$self->seq);
}

sub _canonicalize {
    my $self = shift;
    my $seq = shift;
    $seq =~ tr/uU/tT/;
    $seq =~ s/[^gatcn]//ig;
    return uc($seq);
}

1;


DDJ 
Designing a scripting language for extensibility





John Ousterhout 


Most programming languages are
designed to be self-contained 
worlds. As a programmer, you 
choose a language, then do all 
your programming in that one language. 
It’s often hard to make code written in 
one language work well with code in an- 
other language, so picking a particular lan- 
guage may prevent you from using other 
languages. 

The Tcl scripting language has a dif- 
ferent design philosophy. Instead of con- 
taining everything you need, Tcl was de- 
signed as an integration language to tie 
together pieces of code written in other 
languages. Tcl works well with almost 
any imaginable language or application, 
and most of the interesting functions you 
use in a Tcl script are implemented out- 
side of Tcl. 

Tcl’s flavor comes in large part from
the fact that it is extensible. It was designed
from the start to make it as easy
as possible to add to Tcl’s built-in features
by writing code in C or other languages.
As a result, Tcl has been used in
thousands of different situations to automate
tasks or integrate disparate resources.
In this article, I'll focus on how
extensibility works in Tcl.

John is CEO of Scriptics Corp., and cre-


Why Extensibility? 
Tcl is used in two common ways, both of 
which require extensibility. 


• As an embedded command language.
  This was my original motivation when
  I created Tcl. The idea was to build the
  Tcl interpreter as a library package that
  could be linked into an application as
  its command language, as shown in
  Figure 1. Tcl provides generic facilities
  that any command language needs, including
  variables, control structures
  (such as if and while), procedures, and
  string manipulation. Each application
  then adds its own features into the Tcl
  language as extensions, creating a powerful
  command language that can be
  used to automate and extend the application
  with Tcl scripts. I wanted the
  same base language to be usable for
  almost any application, so Tcl had to
  support as broad a variety of extensions
  as possible. Furthermore, extensions
  needed to behave naturally, as if
  they had been designed into Tcl from
  the beginning: There shouldn’t be obvious
  differences between extensions
  and built-in facilities.

• As a platform for integration applications.
  I did not foresee this usage when
  I created Tcl, but it has become the
  most common way of using Tcl today.
  When used for integration, Tcl is a
  stand-alone platform rather than a
  piece of another application. The extension
  mechanism connects Tcl to resources
  being managed, such as applications,
  databases, news feeds,
  devices, or the Web (see Figure 2). Tcl
  scripts can then be used to coordinate
  all the resources and build new functionality
  on top of their base features.
  The integration task can be as simple
  as connecting an application to its user
  via a graphical user interface, or as
  complex as the control system for an
  oil-well platform, which manages hundreds
  of devices and applications. For
  any language to be good for integration,
  it must connect to a huge variety
  of other resources; Tcl’s extension
  mechanism allows this.
Figure 1: When Tcl is embedded in
an application, it provides basic 
programming facilities that form the 
core of a command language for the 
application. The application then adds 
its own functions into the Tcl 
interpreter as extensions. 



The bottom line is that extensibility gives 
tremendous power to a scripting language. 
Extensibility makes it possible for Tcl to 
connect to resources and automate func- 
tions that were previously manual. In ad- 
dition, extensibility lets Tcl connect to mul- 
tiple disparate resources and integrate 
them to operate in a coordinated fashion. 


Tcl Architecture

When designing Tcl, I developed the C 
APIs for extension at the same time as the 
language itself, and made deliberate trade- 
offs in the design of the language to sim- 
plify and empower the extension mecha- 
nism. This resulted in an unusual design 
process. The goals that influenced Tcl’s 
architecture include: 


• The core Tcl language should have as
  little structure and flavor as possible.
  Structure implies limitations, so a more
  structured language limits the kinds of
  things that extensions can do. Similarly,
  if a language has a strong flavor (such
  as complicated or restricted syntax), it
  will clash with extensions that need a
  different flavor. I wanted Tcl to take on
  the flavor of whatever extensions it is
  used with.

• The language should be extensible in
  as many ways as possible. It should be
  easy to add not only new commands,
  but also new data types and even new
  control structures.

• The extension mechanism should be as
  simple as possible.

• Extensions should have access to all elements
  of the internal state of an interpreter,
  such as variables.

• Data and code should be represented
  inside Tcl in a way that can easily be
  passed back and forth to extensions
  written in C. This, and the desire for as
  little structure as possible, led to the use
  of strings for almost everything.

• The facilities of the core Tcl language
  should be implemented using the same
  mechanisms as extensions. The set of
  things that can only be done inside the
  Tcl core should be as small as possible.


Given these goals, I decided that in- 
terpreting a Tcl script should be a two- 
phase process. In the first phase, the Tcl 
interpreter parses a section of code, iden- 
tifies an extension to execute it, and pass- 
es control to the extension. In the second 
phase, the extension executes the code. 
Control then returns to the Tcl interpreter 
to parse the next section of code. Ideal- 
ly, the Tcl interpreter should understand 
only the bare minimum needed to parse 
some code and pass control to an exten- 
sion. Everything else in the interpretation 
of the script should be left to the exten- 
sion; this gives maximum power and flex- 
ibility to extensions. 

Inspired by UNIX shells such as sh, I
decided on a language syntax based on
commands and words. A Tcl script consists
of one or more commands, and each
command consists of one or more words.
For example, the command set a 43 sets
the value of variable a to 43. It has three
words: set, a, and 43. The interpreter parses
the command and breaks it into words.
It then uses the first word (set) as the name 
of the command, locates a C command 
procedure to execute the command, and 
invokes the command procedure, passing 
it all of the words as arguments. Some 
command procedures, such as the one for 
set, are part of the Tcl interpreter; these 
are called “built-in commands.” Other 
command procedures are part of exten- 
sions. There is no difference between a 
built-in command and an extension ex- 
cept that the command procedures for 
built-in commands are part of the Tcl in- 
terpreter, so they are available in every 
Tcl application. 

In addition to breaking up commands 
into words, the Tcl interpreter performs a 
few other string manipulations before 
passing the words to a command proce- 
dure. Listing One, which illustrates most 
of these features, contains five commands 
separated by newlines. In the second command,
the $ invokes variable substitution:
The letters after the $ are taken as the
name of a variable, and the value of the 
variable is substituted into the command 
in place of the variable name. Thus the 
command procedure receives 43 as its 
third word, not $a, and variable b is as- 
signed that value. 

The [] construct in the third command
invokes command substitution: Everything
between the brackets is processed
as a separate command and the result is
substituted into the outer command. expr
treats its argument (43+10 after the variable
substitution) as an arithmetic expression
and returns the value of the expression,
which is 53. This value is
passed to the set command and assigned
to variable c.

The fourth command shows how dou- 
ble quotes can be used to specify words 
containing spaces: Everything between 
the quotes is passed to the command 
procedure as a single word. puts is a 
command that prints its argument; in this 
case it prints the message The value of 
c is 53. If a word is enclosed in curly 
braces (as in the last command), then 
the information between the braces is 
passed to the command procedure ver- 
batim without substitutions. Thus the $ 
is printed by puts and does not cause 
variable substitution to occur. 

The Tcl interpreter knows nothing about 
commands except what is required to 
break them up into words and perform 
the substitutions just described. As far as 
the Tcl interpreter is concerned, all val- 
ues are strings— including commands, 
words, and results. Any further interpre- 
tation of information is carried out by com- 
mand procedures. Thus only the com- 
mand procedure for expr knows that its 
arguments are numbers and operators. 
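
The division of labor described above (the interpreter only splits words and performs substitutions; command procedures do everything else) can be mimicked in a few dozen lines. The following Python sketch is a toy model, not Tcl's actual parser: the function and class names are invented, expr handles only "integer+integer", and the five-line script is modeled on the commands the text walks through rather than copied from the article's listing.

```python
import re


def split_words(command):
    """Split a command into (kind, text) words: 'brace', 'quote' or 'bare'."""
    words, i, n = [], 0, len(command)
    while i < n:
        if command[i].isspace():
            i += 1
        elif command[i] == "{":
            depth, j = 1, i + 1
            while j < n and depth:
                depth += {"{": 1, "}": -1}.get(command[j], 0)
                j += 1
            words.append(("brace", command[i + 1:j - 1]))
            i = j
        elif command[i] == '"':
            j = command.index('"', i + 1)
            words.append(("quote", command[i + 1:j]))
            i = j + 1
        else:
            j, depth = i, 0
            # Spaces inside [...] do not end a bare word.
            while j < n and (depth or not command[j].isspace()):
                depth += {"[": 1, "]": -1}.get(command[j], 0)
                j += 1
            words.append(("bare", command[i:j]))
            i = j
    return words


def substitute(text, interp):
    """Apply $variable and [command] substitution to one word."""
    out, i = [], 0
    while i < len(text):
        if text[i] == "$" and (m := re.match(r"\w+", text[i + 1:])):
            out.append(interp.variables[m.group()])
            i += 1 + m.end()
        elif text[i] == "[":
            depth, j = 1, i + 1
            while j < len(text) and depth:
                depth += {"[": 1, "]": -1}.get(text[j], 0)
                j += 1
            out.append(interp.eval_command(text[i + 1:j - 1]))
            i = j
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def cmd_set(interp, words):
    interp.variables[words[1]] = words[2]
    return words[2]


def cmd_expr(interp, words):
    # Only "integer+integer" is supported in this toy version.
    left, right = "".join(words[1:]).split("+")
    return str(int(left) + int(right))


def cmd_puts(interp, words):
    interp.output.append(words[1])
    return ""


class Interp:
    def __init__(self):
        self.variables, self.output = {}, []
        self.commands = {"set": cmd_set, "expr": cmd_expr, "puts": cmd_puts}

    def eval_command(self, command):
        # Braced words are passed verbatim; all others are substituted.
        words = [text if kind == "brace" else substitute(text, self)
                 for kind, text in split_words(command)]
        return self.commands[words[0]](self, words)

    def eval(self, script):
        result = ""
        for line in script.splitlines():
            if line.strip():
                result = self.eval_command(line)
        return result


interp = Interp()
interp.eval('set a 43\nset b $a\nset c [expr $a+10]\n'
            'puts "The value of c is $c"\nputs {The value of c is $c}')
print(interp.output)
```

Note that the interpreter class never looks inside a word to decide whether it is a number; only cmd_expr does that, which is the point the paragraph above makes about expr.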

Figure 2: Tcl can also be used as a
platform for integration: Extensions
connect the Tcl interpreter to various
resources, then Tcl scripts can be
written to coordinate the resources
and extend their facilities.

Control structures such as if and while
are just commands that treat their arguments
as Tcl scripts; see Listing Two for
an example. The command procedure for
foreach receives four words: foreach, i,
2 4 6 8 10, and the Tcl script contained between
the curly braces. foreach implements
a loop; for each of the values 2
through 10, it sets variable i to that value
and then invokes the Tcl interpreter recursively,
passing it the last argument of
foreach as the script to execute. Only
the command procedure for foreach
knows that its third word is actually a
list of values and the fourth word is a nested
Tcl script. Because the script is enclosed
in braces, no substitutions occur before it
is passed to the foreach command procedure;
however, when the script is passed
back to the Tcl interpreter for each iteration
of the loop, the braces are no longer
present so substitutions are done. Tcl procedures
are created in a similar fashion
by invoking a command proc that takes
as its arguments a procedure name, a list
of arguments, and a Tcl script that is the
procedure’s body.
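
The idea that a control structure is just another command can be shown with a deliberately tiny model. This Python sketch is an illustration only: the evaluator splits words on whitespace and substitutes whole words of the form $name, the command names mirror the text, and the loop body "puts $i" is made up. The words are handed to the foreach procedure already split, standing in for the brace handling that the real interpreter performs.

```python
class Interp:
    """Tiny evaluator: one command per line, whitespace-separated words,
    and a word of the form $name replaced by that variable's value."""

    def __init__(self):
        self.variables = {}
        self.output = []
        self.commands = {}

    def eval(self, script):
        result = ""
        for line in script.splitlines():
            words = [self.variables[w[1:]] if w.startswith("$") else w
                     for w in line.split()]
            if words:
                result = self.commands[words[0]](self, words)
        return result


def cmd_puts(interp, words):
    interp.output.append(" ".join(words[1:]))
    return ""


def cmd_foreach(interp, words):
    # words: command name, variable name, list of values, body script.
    _, name, values, body = words
    for value in values.split():
        interp.variables[name] = value
        # Recursive call: $i in the body is substituted now, once per pass.
        interp.eval(body)
    return ""


interp = Interp()
interp.commands.update(puts=cmd_puts, foreach=cmd_foreach)

# The body arrives unsubstituted, as it would after brace quoting.
cmd_foreach(interp, ["foreach", "i", "2 4 6 8 10", "puts $i"])
print(interp.output)   # ['2', '4', '6', '8', '10']
```

Because the body reaches the command procedure as an ordinary string, nothing in the evaluator itself needs to know that loops exist; adding a new control structure means registering one more function.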

People often ask why Tcl requires the 
use of the set and expr commands, instead 
of traditional assignment statements with 
implicit arithmetic, such as c=a+10. The 
reason is that this would have predefined 




many features of the language. For ex- 
ample, a command couldn’t have “=” as 
its second word without causing assign- 
ment, and “+” would always invoke ad- 
dition. This would have reduced the pow- 
er of extensions to apply their own 
meanings to their arguments, so it would 
have limited Tcl’s extensibility. 


A Simple Command Procedure 

To create a new Tcl extension, you im- 
plement one or more new commands, 
writing a command procedure for each. 
Traditionally, command procedures have 
been written in C, and that’s what I'll use 
here. However, you can also write command
procedures in C++ or Java (using
an extension called “TclBlend” that connects
Tcl to Java; see “TclBlend: Blending
Tcl and Java,” by Scott Stanton, DDJ,
February 1998). Once you’ve written the
command procedures for your extension, 
you compile them, load them into an ap- 
plication containing Tcl, and register 
them with the Tcl interpreter by telling 
Tcl the name of each command and the 
address of its command procedure. I'll 
skip the details of compiling, loading, 
and registering command procedures to 
focus on the internals of command pro- 
cedures. 

The first example is a new command,
add1, which takes a single integer argument.
The command adds “1” to its argument
and returns the result. For example,
add1 12 returns 13. Listing Three is the
command procedure for add1.

Once Add1Cmd has been registered as 
the command procedure for add1, the Tcl 
interpreter calls Add1Cmd whenever add1 
is invoked. Command procedures receive 
four arguments. The first argument isn’t 
used in this example; it is used in more 
complex cases to identify an object asso- 
ciated with the command, such as an open 
file or graphical control. The interp argu- 
ment is a handle for the Tcl interpreter 
where the command was invoked. objc 
gives a count of the total number of words 
in the command (including the command 
name), and objv is an array that has ele- 
ments that are the values of the words af- 
ter all substitutions have been performed 
by the Tcl interpreter. objc and objv are
similar to the argc and argv parameters 
used to pass command-line arguments to 
a UNIX main() function. 

Values are passed around in Tcl using 
structures of type Tcl_Obj. Each word of
a command is represented with a Tcl_Obj,
each command returns a Tcl_Obj result,
each Tcl variable stores its value in a 
Tcl_Obj, and so on. Think of a Tcl_Obj as 
storing a string value of arbitrary length. 
Tcl provides a library of procedures that 
convert the string values in Tcl_Objs 
to/from other forms, such as integers. 
Tcl_Objs also contain information that im- 
proves efficiency by eliminating unneces- 
sary string conversions. 

A command procedure returns two values
to the Tcl interpreter. The first is a result,
which is stored in the interpreter and
accessed via procedures such as Tcl_SetResult
or Tcl_SetObjResult. The second
value is an integer completion code, 
which is returned as the result of the com- 
mand procedure. A completion code of 
TCL_OK means that the command com- 
pleted successfully. TCL_ERROR means 
that an error occurred while executing 
the command and the script should be 
aborted; in this case the interpreter’s re- 
sult contains an error message to present 
to the user. Other values, such as 
TCL_RETURN and TCL_BREAK, are used 
to handle returns from Tcl procedures
and escapes from loops. 

The Add1Cmd procedure first makes 
sure that there were two words in the 
command (the command name and val- 
ue to increment); if not, it calls Tcl_Set- 
Result to store an error message string 
in the interpreter’s result, then it returns 
the TCL_ERROR completion code. If the 
argument count is correct, Add1Cmd re- 
trieves the integer value of the second 
word of the command by calling 
Tcl_GetIntFromObj. This procedure at- 
tempts to translate the string value of the 
argument to an integer. If the operation 
succeeds, it stores the integer value in i 
and returns TCL_OK. If the value can’t be 
converted to an integer (the command was 
add1 dog), then Tcl_GetIntFromObj stores 
an error message in interp’s result and re- 
turns TCL_ERROR. When Add1Cmd sees 
the error return, it returns an error to its 
caller. This style is used commonly 
throughout Tcl: Procedures use TCL_OK 
and TCL_ERROR return values to indi- 
cate whether they succeeded; if errors 
occur, they store error messages in the 
interpreter’s result before returning 
TCL_ERROR. Once one procedure re- 
turns TCL_ERROR, its caller also returns 
TCL_ERROR until control returns to Tcl, 
which then aborts the script and displays 
the error message to users. 
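
The convention just described can be sketched without the Tcl headers. The following is only an illustration: MiniInterp, MINI_OK, MINI_ERROR, mini_get_int, and mini_add1 are hypothetical stand-ins invented here, not part of the Tcl library. It shows the lowest-level procedure storing the message and each caller simply passing the error code upward.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for TCL_OK / TCL_ERROR and the interpreter. */
enum { MINI_OK = 0, MINI_ERROR = 1 };

typedef struct {
    char result[128];   /* command result, or an error message */
} MiniInterp;

/* Lowest level: convert a word to an int; on failure, leave a message. */
int mini_get_int(MiniInterp *interp, const char *word, int *out)
{
    char *end;
    long v;

    assert(interp != NULL && word != NULL && out != NULL);
    errno = 0;
    v = strtol(word, &end, 10);
    if (end == word || *end != '\0' || errno == ERANGE
            || v < INT_MIN || v > INT_MAX) {
        snprintf(interp->result, sizeof interp->result,
                 "expected integer but got \"%s\"", word);
        return MINI_ERROR;
    }
    *out = (int) v;
    return MINI_OK;
}

/* Command level: check the word count, then pass any error straight up. */
int mini_add1(MiniInterp *interp, int argc, const char *argv[])
{
    int i;

    if (argc != 2) {
        strcpy(interp->result, "wrong number of arguments");
        return MINI_ERROR;
    }
    if (mini_get_int(interp, argv[1], &i) != MINI_OK) {
        return MINI_ERROR;   /* the callee already stored the message */
    }
    snprintf(interp->result, sizeof interp->result, "%lld",
             (long long) i + 1);
    return MINI_OK;
}
```

Because the message travels in the interpreter structure and only a small integer code is returned, any number of intermediate callers can forward a failure without knowing what went wrong.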

If the integer value is converted suc- 
cessfully, Add1Cmd calls Tcl_NewIntObj, 
which creates a new Tcl_Obj and stores 
an integer in it, automatically converting 
the integer value to a string. Then Tcl_Set-
ObjResult stores that object as the inter-
preter’s result and Add1Cmd returns with
a successful completion code. 


A New Looping Command 

To illustrate how straightforward it is to 
define a new control structure in Tcl, the 
next example implements a new com-
mand called loop. Listing Four shows
how loop is used. The loop command
takes as arguments the name of a vari-
able, two integers, and a Tcl script. It
sets the variable to each integer value in
the given range and invokes the Tcl script
once for each value. Listing Five is the 
command procedure that implements the 
loop command. 

LoopCmd uses several new Tcl proce- 
dures. Tcl_ObjSetVar2 sets the value of a 
Tcl variable, given a Tcl_Obj containing 
the variable’s name and a Tcl_Obj con-
taining the value. Tcl_EvalObj is the main
entry point to the Tcl interpreter: It is 
called once for each iteration of the loop 
to evaluate the loop body. Errors can po- 
tentially occur in Tcl_ObjSetVar2 or 
Tcl_EvalObj. If this happens, the proce-
dure leaves an error message in interp’s
result and returns TCL_ERROR; this caus-
es LoopCmd to return an error as well.
Tcl_DecrRefCount frees the object point- 
ed to by valuePtr if it couldn’t be assigned 
to the variable. 

This example demonstrates three fea- 
tures of Tcl: 


• How new control structures can be im-
plemented as extensions. This is an un-
usual feature of Tcl that is present in
few, if any, other languages.

• How the command procedures define
the meanings of their arguments (two
arguments are treated as integers, one
as a variable name, and one as a Tcl
script).

• How extensions can access the internals
of a Tcl interpreter, in this case by read-
ing and writing variables.







More information about Tcl library pro-
cedures is available at http://www.scriptics.com/man/.


More On Tcl_Obj Structures


In versions of Tcl before Tcl 8.0, there 
were no Tcl_Obj structures. Instead, all in-
formation was represented with C strings.
Each command procedure received an ar- 
ray of strings containing the words of the 
command and returned a string result in 
the interpreter instead of a Tcl_Obj. Vari- 
able values, scripts, and virtually all oth- 
er things in Tcl were represented with 
strings. 

Strings provided a simple and power- 
ful way of passing information around, 
and they made it easy to write extensions 
that connect Tcl with almost anything— 
but they were not efficient. For exam-
ple, consider set x [expr $x * 2], which
multiplies a variable by two. The value 
of the variable was stored as a string, so 
the expr command had to convert its ar- 
guments from strings to integers, perform 
the multiplication, then convert the re- 
sult back to a string. If the command was 
executed repeatedly then the string con- 
versions happened each time. A similar 
problem occurred with scripts: Each time 
the body of a looping command like loop
was executed, it was passed into the Tcl 
interpreter as a string, so the Tcl inter- 
preter had to parse the commands and 
words from scratch. Consequently, most 
of the execution time for Tcl scripts was 
spent converting to and from strings. 
Tcl_Objs were introduced in Tcl 8.0 to 
eliminate unnecessary string conversions; 
they are now used in most of the places 
where strings were used in earlier versions 
of Tcl. A Tcl_Obj stores a string plus an in-
ternal representation; see Figure 3. If the
value of a Tcl_Obj is required in a form
other than a string, then the value is con-
verted and the other form is saved as the
internal representation of the Tcl_Obj. If
the value is needed again in this other form, 
it can be retrieved immediately from the 
Tcl_Obj without recomputing it from the 
string. For example, the library procedure 
Tcl_GetIntFromObj creates and reuses in- 
teger internal representations. The value of 
a Tcl_Obj is defined by its string represen- 
tation: If the string value of a Tcl_Obj is 
4.800, it might be converted to a floating-
point internal representation of 4.8, but it
will still print as 4.800. The internal repre-
sentation just caches the result of a string
conversion to improve performance.

Figure 3: In Tcl 8.0 and later
versions, Tcl_Obj structures are used
to represent most data. A Tcl_Obj can
hold a string value (with length) and
also an equivalent but more efficient
internal representation. Small
internal representations can be stored
directly in the Tcl_Obj; larger values
are allocated separately with a pointer
stored in the Tcl_Obj. The type field
identifies the current form of the
internal representation and makes the
internal representation mechanism
extensible. The reference count allows
Tcl_Objs to be shared.


Dr. Dobb’s Journal, June 1999

If an internal representation is available
when a new Tcl_Obj is created, such as
an integer result from an expr command,
it is stored in the new Tcl_Obj and the
string value of the Tcl_Obj is left empty.
If the value is used only as an integer
(such as in subsequent expr commands),
then no string value is ever created. If the
string value is needed, then at that time
the integer value is converted to a string;
both the integer and string values are
stored in the Tcl_Obj so that either can be
used in the future without any additional
conversions.
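
The caching behavior described above can be modeled in a few lines of C. This is a minimal sketch under stated assumptions: DualValue and the dual_* functions are hypothetical names made up for illustration and do not reflect the actual Tcl_Obj layout or API. The conversions counter exists only to make the "convert at most once" property visible.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of the idea; not the real Tcl_Obj layout. */
typedef struct {
    char *bytes;       /* string form, or NULL if not generated yet */
    int   hasInt;      /* nonzero when intRep holds a valid integer */
    long  intRep;      /* cached integer form */
    int   conversions; /* counts conversions, for demonstration only */
} DualValue;

/* Start from a string; the integer form is computed on demand. */
void dual_from_string(DualValue *v, const char *s)
{
    size_t n = strlen(s) + 1;

    v->bytes = malloc(n);
    assert(v->bytes != NULL);
    memcpy(v->bytes, s, n);
    v->hasInt = 0;
    v->intRep = 0;
    v->conversions = 0;
}

/* Start from an integer; the string form is left empty. */
void dual_from_int(DualValue *v, long i)
{
    v->bytes = NULL;
    v->hasInt = 1;
    v->intRep = i;
    v->conversions = 0;
}

/* Fetch the integer form, parsing the string at most once. */
int dual_get_int(DualValue *v, long *out)
{
    if (!v->hasInt) {
        char *end;
        long parsed = strtol(v->bytes, &end, 10);

        if (end == v->bytes || *end != '\0') {
            return -1;             /* not an integer */
        }
        v->intRep = parsed;
        v->hasInt = 1;
        v->conversions++;
    }
    *out = v->intRep;
    return 0;
}

/* Fetch the string form, generating it at most once. */
const char *dual_get_string(DualValue *v)
{
    if (v->bytes == NULL) {
        char buf[32];
        int n = snprintf(buf, sizeof buf, "%ld", v->intRep);

        assert(n > 0 && (size_t) n < sizeof buf);
        v->bytes = malloc((size_t) n + 1);
        assert(v->bytes != NULL);
        memcpy(v->bytes, buf, (size_t) n + 1);
        v->conversions++;
    }
    return v->bytes;
}

/* Release the string form, if one was ever created. */
void dual_free(DualValue *v)
{
    free(v->bytes);
    v->bytes = NULL;
}
```

A value created from the string 007 yields the integer 7 but still prints as 007, which mirrors the 4.800 example: the string defines the value, and the integer is only a cache.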

The Tcl_Obj mechanism allows for 
many different kinds of internal repre- 
sentations. For example, lists like the ar- 
gument to foreach are converted to an 
internal representation that is an array of 
Tcl_Objs; this allows faster access than 
earlier versions of Tcl, which had to re- 
scan the list from its beginning to retrieve 
any element. Before a Tcl script is exe- 
cuted, it is converted to an internal rep- 
resentation consisting of bytecodes that 
allow rapid execution. If a script is exe- 
cuted repeatedly, such as a loop body, 
subsequent executions are even faster be- 
cause the script doesn’t need to be parsed 
again; this provides a substantial speedup 
in Tcl 8.0. 

To distinguish between different kinds 
of internal representations, each Tcl_Obj 
contains a field indicating the type of its 
internal representation. If a particular 
type of internal representation is desired 
(a list, for instance) and another type is 
present (bytecodes), then the existing in- 
ternal representation is discarded and re- 
placed with the desired type (a Tcl_Obj 
can hold only one internal representa- 
tion at a time). New types can be de- 
fined by providing a few methods to im- 
plement that type, such as a method to 
copy the internal representation, one to 
free the internal representation, and one 
to regenerate the string value corre- 
sponding to the internal representation. 
Extensions can define new types to 
speed up their own conversions. 
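
One way to picture the type field and its methods is a small table of function pointers. The sketch below is an assumption-laden illustration, not Tcl's actual type interface: ValueType, Value, value_set_rep, and the "long" and "pair" types are hypothetical, and only two of the methods mentioned above (free and regenerate-string) are modeled.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct Value;

/* Hypothetical type descriptor: the methods one representation supplies. */
typedef struct ValueType {
    const char *name;
    void (*freeRep)(struct Value *v);   /* release the internal form */
    void (*toString)(const struct Value *v, char *buf, size_t size);
} ValueType;

typedef struct Value {
    const ValueType *type;  /* identifies the current form; NULL if none */
    void *rep;              /* the internal representation itself */
} Value;

int freedReps = 0;          /* counts discarded representations (demo) */

static void heap_free_rep(Value *v)
{
    free(v->rep);
    v->rep = NULL;
    freedReps++;
}

static void long_to_string(const Value *v, char *buf, size_t size)
{
    snprintf(buf, size, "%ld", *(const long *) v->rep);
}

static void pair_to_string(const Value *v, char *buf, size_t size)
{
    const long *p = v->rep;

    snprintf(buf, size, "%ld %ld", p[0], p[1]);
}

const ValueType longType = { "long", heap_free_rep, long_to_string };
const ValueType pairType = { "pair", heap_free_rep, pair_to_string };

/* Install a new representation, discarding whatever was there before. */
void value_set_rep(Value *v, const ValueType *type,
                   const long *data, size_t count)
{
    assert(type != NULL && data != NULL && count > 0);
    if (v->type != NULL) {
        v->type->freeRep(v);    /* only one internal form at a time */
    }
    v->rep = malloc(count * sizeof *data);
    assert(v->rep != NULL);
    memcpy(v->rep, data, count * sizeof *data);
    v->type = type;
}
```

Code that handles a Value never needs to know which representations exist; it calls through the descriptor, which is what lets an extension add a type of its own.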

The Tcl_Obj mechanism retains all the
flexibility of using strings for representing 
data, while improving performance dra- 
matically. I've found that most scripts ex- 
ecute two to five times faster under Tcl 
8.0 than under previous versions. This 
gives Tcl about the same speed as Perl 
and other scripting languages that don’t 
have Tcl’s easy extensibility. 


Sample Extensions 

Tcl’s extension mechanism has allowed 
Tcl to be used for a variety of applications, 
including the real-time control for oil plat-
forms, automated hardware testing, fac-
tory automation, web content generation,
financial trading applications, and char- 
acter animation in motion pictures such 
as Toy Story and A Bug’s Life. In many cas- 
es, extensions are created for internal use 
within an organization. In addition, there 
are numerous extensions freely available 
via the Web (visit http://www.scriptics.com/resource/).
Examples of open-source extensions include:


• Oratcl and Sybtcl, by Tom Poindexter,
provide an easy way to access the pop-
ular Oracle and Sybase databases
(http://www.nyx.net/~tpoindex/tcl.html).

• TclX, by Mark Diekhans and Karl
Lehenbauer, provides access to many
of the UNIX kernel facilities. It also ex-
tends the Tcl facilities for manipulating
lists, adds its own new data type (keyed
lists), creates new control structures for
scanning files, and adds a profiling
mechanism to Tcl (http://www.neosoft.com/TclX/).

• [incr Tcl], by Michael McLennan, adds
object-oriented programming to Tcl. [incr
Tcl] adds a class mechanism with ob-
jects, methods, and inheritance
(http://www.tcltk.com/itcl/).

• Expect, by Don Libes, simulates users
typing at terminals, making it possible
to automate terminal-oriented applica-
tions. It adds new control structures that
associate Tcl scripts with patterns of out-
put generated by the application
(http://expect.nist.gov/).

• Tk, a GUI toolkit I created, lets you
create GUIs from Tcl. It also adds an
event binding mechanism to associate
Tcl scripts with UI events such as but-
ton clicks and keystrokes
(http://www.scriptics.com/software/download.html).


Conclusion 

Extensibility is one of the key reasons for 
Tcl’s success. For example, extensibility 
made it easy to implement the Tk toolkit, 
which is one of the most common rea- 
sons people give for using Tcl. Extensi- 
bility also lets Tcl be used as a general- 
purpose automation tool—it can be 
connected to, or embedded in, almost any- 
thing and used to automate previously 
manual tasks. For example, Tcl has be- 
come the language of choice for auto- 
mated hardware and software testing. Last- 
ly, extensibility has made Tcl into a 
powerful integration platform where the 
base language is augmented with exten- 
sions to connect to disparate resources, 
and Tcl scripts are written to coordinate 
the resources. 


Listing One

set a 43
set b $a
set c [expr $a+10]
puts "The value of c is $c"
puts {Lunch costs $6.95}


Listing Two 


foreach i {2 4 6 8 10} {
    puts "$i squared is [expr $i*$i]"
}


Listing Three 


#include <tcl.h>
int Add1Cmd(ClientData dummy, Tcl_Interp *interp, int objc,
        Tcl_Obj *objv[]) {
    int i;
    /* Expect exactly two words: the command name and the value. */
    if (objc != 2) {
        Tcl_SetResult(interp, "wrong number of arguments", TCL_STATIC);
        return TCL_ERROR;
    }
    /* On failure, Tcl_GetIntFromObj leaves the message in interp. */
    if (Tcl_GetIntFromObj(interp, objv[1], &i) != TCL_OK) {
        return TCL_ERROR;
    }
    Tcl_SetObjResult(interp, Tcl_NewIntObj(i+1));
    return TCL_OK;
}


Listing Four 
set factorial 1
loop i 1 7 {
    set factorial [expr $factorial*$i]
}
puts "7 factorial is $factorial"


Listing Five 
#include <tcl.h>
int LoopCmd(ClientData dummy, Tcl_Interp *interp, int objc,
        Tcl_Obj *objv[]) {
    int current, last, code;
    Tcl_Obj *valuePtr;
    /* Expect: loop varName first last script */
    if (objc != 5) {
        Tcl_SetResult(interp, "wrong number of arguments", TCL_STATIC);
        return TCL_ERROR;
    }
    if (Tcl_GetIntFromObj(interp, objv[2], &current) != TCL_OK) {
        return TCL_ERROR;
    }
    if (Tcl_GetIntFromObj(interp, objv[3], &last) != TCL_OK) {
        return TCL_ERROR;
    }
    for ( ; current <= last; current++) {
        /* Set the loop variable, then evaluate the body once. */
        valuePtr = Tcl_NewIntObj(current);
        if (Tcl_ObjSetVar2(interp, objv[1], (Tcl_Obj *) NULL,
                valuePtr, TCL_LEAVE_ERR_MSG) == NULL) {
            Tcl_DecrRefCount(valuePtr);
            return TCL_ERROR;
        }
        code = Tcl_EvalObj(interp, objv[4]);
        if (code != TCL_OK) {
            return code;
        }
    }
    return TCL_OK;
}
Digital Video Camcorders

Thomas Tewell

Thomas is a software engineer at Sequoia Advanced Technologies Inc.,
which specializes in 1394 software development. He can be reached at
thomas.tewell@seqadvtech.com.


Walk into any consumer electronics store and, faster than you can say
“I don’t want the service agreement,” digital video camcorders
from the likes of Sony, Panasonic, Sharp, Canon, Samsung, and JVC
are thrust into your hands. After examining a few of these
gadgets, the engineer in me had some questions:

• Are all digital camcorders compatible with one another? (Mostly.)

• Do they all use the same tapes? (If they sport a DV logo they do.)

• What outputs (and/or inputs) are on these cameras? (On most, analog
video out, analog audio out, and digital I/O.)

• What is the digital I/O format? (IEEE 1394, also known as
“FireWire” or “iLink.”)

IEEE 1394. Now there’s something I know about (see, for instance,
my article “FireWire: The IEEE 1394 Serial Bus,” DDJ, September 1997).
Based on my experience with 1394, I bought Sony’s DCR-PC10, a
digital video (DV) camcorder that uses a cassette slightly smaller
than a DAT tape to store 1.5 hours of video/audio. The audio is
16-bit stereo sampled at 44.1 kHz. The video format is NTSC with
720x480 resolution stored at 30 frames per second.

The camera has both analog (which can be directly connected to any
TV or VCR) and digital outputs. It does not have analog inputs,
which prevents you from making some really good copies of The Lion
King. However, the PC10 does let you input digital data, which lets
you edit video on a PC, then rerecord it using the camcorder. (For
more information on the PC10, see
http://www.sel.sony.com/SEL/consumer/ss5/office/camcorder/digitalvideoproducts/dcr-pc10_specs.shtml.)
Because my PC has a 1394 card and the Sony PC10 a 1394 port, it
seemed I should be able to connect the camcorder to the PC and grab
pictures, especially since Windows 98 boasts “embedded” 1394
software support. I plugged the PC10 into the 1394 card (which
requires a special 4-pin-to-6-pin cable) and Windows 98 reported
“New Hardware Found.” With my heart racing (I told you I was an
engineer), I waited for Windows 98 to continue loading so I could
get on with the task of grabbing pictures from my PC10.

Alas, that was the peak of my con- 
sumer electronic “high,” as Windows 98 
popped up a dialog box saying “Add 
New Hardware Wizard 1394\A02D&10001.”
I pressed the Next button and
was instructed to “Search for the best 
driver for your device.” It turns out that 
the “best driver” wasn’t on my Windows 
98 CD-ROM. Three hours later I realized 
that Windows 98 didn’t come with the 
software I needed to make my PC10 
work— even though it does come with 
a DirectShow DV Codec for converting 
DV frames into pictures. I eventually 
found the DV Codec (QDV.DLL)— writ- 
ten to encode/decode DV data from cam- 
corders like the PC10—in the Win- 
dows\System directory. What was going 
on? Three days later and enough phone 
calls to make AT&T smile, I got my an- 
swer from a Microsoft support person 
who told me I needed a DV camcorder 
device driver. It turns out that the DV 
Codec only decodes DV frames— it 
doesn’t grab them from the 1394 bus. 
Feeling somewhat sheepish, I asked 
where to get a DV camcorder device 
driver. “I’m not sure,” came the reply, 
“but that is definitely what you need.” 

Now, I am an engineer— a software 
engineer — and this sounded like a chal- 
lenge. Consequently, I decided then and 
there to write my own 1394 DV cam- 
corder driver. With nothing but the driv- 
er source code I present here, you will 
be able to connect a DV camera to a Win- 
dows 98/1394-equipped PC and grab pic- 
tures. Along the way, I'll share the trials 
and tribulations of what it’s like to de- 
velop DV-based software. 

In designing the code, I partitioned the 
process into several distinct pieces: 


• Writing a skeleton WDM 1394 driver that
simply establishes communication with
the camcorder.

• Capturing DV video frames from the
1394 bus.

• Sending those DV video frames on to
the DV Codec to be turned into pictures.


The resultant software is a WDM 1394
DV camcorder driver called “DDJDVCAP.SYS,”
with a corresponding .INF file
and a Win32 console utility that controls
the DV camcorder driver. In this article,
I'll present the WDM 1394 driver and all
the files necessary for a complete 1394
class driver package (available electroni-
cally; see “Resource Center,” page 5). In
future articles, I'll present the code for cap-
turing DV video data and sending a DV
frame to a Win32 application.


DDJDVCAP.SYS: A Guided Tour 

In addition to a Windows 98-based PC, 
the hardware consists of a Texas Instru- 
ments OpenHCI 1394 PCI bus controller 
and a Sony DCR-PC10 camcorder. When 
I plug the camcorder into the bus con- 
troller, the 1394 bus resets— normal for 
whenever a device connects to the bus. 


IEEE 1394 is completely plug-and-play
(PnP). You can connect/disconnect a de- 
vice at any time. The bus reset causes 
the 1394 bus driver (1394BUS.SYS) to 
check what has been connected/dis- 
connected from the 1394 bus. Whenev- 
er 1394BUS.SYS finds a new 1394 de- 
vice, it creates a DeviceObject (an official 
WDM driver structure through which all 
device communication becomes possi- 
ble), then registers the device with the 
PnP system. The PnP system checks in 
the registry under
My Computer\HKEY_LOCAL_MACHINE\Enum\1394
for an entry that matches the 1394 de- 
vice’s signature. If the 1394 device’s sig- 
nature is found in the registry, it looks 
in the key labeled “Driver” and loads the 
specific device driver indirectly pointed 
there. If it does not find the 1394 de- 
vice’s signature in the registry, it pops 
up a dialog box and asks you to insert 
a disk, CD-ROM, or path where the ap- 
propriate device driver can be found. 

The first step to writing a 1394 WDM 
driver is creating an .INF file that contains 
specific information regarding the target- 
ed 1394 device. This .INF file is used by 
the PnP system to copy the device driver 
to the appropriate directory
(\WINDOWS\SYSTEM32\DRIVERS) and update
the appropriate registry entries needed to 
load the driver once the Sony PC10 is con- 
nected to the system. You need to spec- 
ify what “Class” you wish to be installed 
under, as well as the PnP ID. In the case 
of the Sony PC10, the PnP ID is
1394\A02D&10001, and you are creating a new
class called DDJDVCap for this project
(see DDJDVCAP.INF, available electroni-
cally, for more details).

There are a few mandatory routines 
that you must supply as a WDM driver. 
The first is DriverEntry, which is the first 
function called after a driver is loaded. 
The driver loader creates and supplies a 
DriverObject as a parameter to the Driv- 
erEntry routine. A DriverObject is another 
one of those “official” WDM driver struc- 
tures. For the most part, DriverObject is 
a table of pointers to the various routines 
in your WDM device driver. The next two 
mandatory routines (mandatory for a PnP 
WDM driver) that we fill out in the skele-
ton driver are DriverObject->DriverEx-
tension->AddDevice and DriverObject->
MajorFunction[IRP_MJ_PNP].

The AddDevice routine is called when
the 1394 device specified in the driver’s cor-
responding registry entry is plugged into
the 1394 bus controller. It is in AddDevice
that you create the symbolic link that lets
Win32 applications call the driver. The sym-
bolic name for my driver is
\DosDevices\DDJDVCAP. To open this driver, the
Win32 application will use the Win32 func-
tion CreateFile with “\\\\.\\DDJDVCAP”
as the name parameter (see DDJDVCAP.C,
available electronically). AddDevice then
creates a DeviceObject, the structure used
to represent our device to the I/O Manag-
er. I then attach DeviceObject to the 1394
camcorder DeviceObject supplied to my
AddDevice routine.

As mentioned previously, my AddDe- 
vice routine is called whenever a Sony 
PC10 is connected to the 1394 bus. Add- 
Device is called with a DeviceObject as 
one of its arguments. This particular De- 
viceObject is the DeviceObject created by 
1394BUS.SYS when it detected the Sony 
PC10 on the 1394 bus. I must use this 
DeviceObject whenever I send a 1394 re- 
quest/command to the Sony PC10. It is 
important to note that since each enumerated 1394 device has only a
single DeviceObject, there will be only one driver to which this
DeviceObject will be passed via PnP. While architecturally there is
nothing that prevents multiple drivers from using the same
DeviceObject to execute 1394 requests to a particular device, there
is no realistic mechanism that allows a driver (under the context of
the WDM PnP system) to get passed a DeviceObject for a particular
1394 device that has already been assigned and passed to an existing
1394 driver. In short, one 1394 device, one WDM driver.

Once the 1394BUS.SYS has generated the DeviceObject, you will attach
it to the local DeviceObject that you will create for your driver so
you can field all of the PnP messages intended for the Sony PC10. I
could use the 1394BUS.SYS-generated DeviceObject for the PC10
directly if I just wanted to send requests to the PC10, but since I
want to intercept requests from other drivers (like the PnP system),
I must attach the PC10’s DeviceObject to the local DeviceObject. This
is done by creating a local DeviceObject with IoCreateDevice(), then
by using IoAttachDeviceToDeviceStack(OurDeviceObject,
SonyPC10DeviceObject). The new DeviceObject returned from
IoAttachDeviceToDeviceStack is what I now use as my Sony PC10
DeviceObject whenever I execute 1394 requests.

Irb = ExAllocatePool(NonPagedPool, sizeof(IRB));
...Fill out IRB for desired 1394 function...
NextIrpStack = IoGetNextIrpStackLocation(Irp);
NextIrpStack->MajorFunction = IRP_MJ_INTERNAL_DEVICE_CONTROL;
NextIrpStack->Parameters.DeviceIoControl.IoControlCode = IOCTL_1394_CLASS;
NextIrpStack->Parameters.Others.Argument1 = Irb;
...Setup IRP Completion Routine...
status = IoCallDriver(DeviceExtension->SonyPC10DeviceObject, Irp);

Example 1: Sending a 1394 request.


At this point, I have the DeviceObject that 
I can use to send 1394 requests/commands 
to the Sony PC10. What commands can I 
send? Table 1 lists the 1394 functions avail- 
able from the WDM 1394 driver interface. 

For the Sony PC10 video capture driv- 
er, I will only use a small subset of these 
1394 functions. In fact, I could get away 
with only using six of them (highlighted 
in red), but will probably end up using 
10 (additional four highlighted in green). 





Example 2: 1394 config ROM unique
identifier format.

The basic structure used to execute 1394
requests is called the I/O Request Block 
(IRB). The IRB (and all other 1394 per- 
tinent information) can be found in the 
file 1394.H, located in the \98ddk\inc\ 
win98 directory of the Windows 98 DDK. 
The IRB is filled out and shipped off via 
an I/O Request Packet (IRP). IRPs and
I/O Stack Locations are the primary driv- 
er communication structures of WDM. 







IRBs were created specifically for the 
1394 WDM driver interface. Example 1, 
for instance, sends the IRP (with IRB in 
tow) to 1394BUS.SYS for 1394 request 
execution. 


The Win32 Interface 

The DriverObject->MajorFunction[IRP_MJ_DEVICE_CONTROL]
field contains the pointer to the function that
DDJDVCAP.SYS uses to field Win32 application
requests. This entry point in my driver is
called DDJDV_Dispatch. When the Win32
test utility DDJDVTST.C (Listing One) is-
sues requests to DDJDVCAP.SYS via the
DeviceIoControl() function, DDJDV_Dispatch
fields the request. My driver receives
an IRP, which contains the information
necessary to carry out the request.
An IRP is the most basic of I/O Manager
structures and is the way WDM drivers
communicate. Irp->AssociatedIrp.SystemBuffer
contains the incoming data structure
that corresponds to the lpInBuffer
and lpOutBuffer parameters of
DeviceIoControl().

Table 1: WDM 1394 function reference

The data supplied in these buffers is
doubly buffered between Ring 3 (Win32)
and Ring 0 (WDM). The data supplied
via lpInBuffer of DeviceIoControl() is
copied into an intermediate memory
space and supplied to DDJDV_Dispatch
by way of the Irp->AssociatedIrp.SystemBuffer
field. When you return from
the function, any data that you wrote to
Irp->AssociatedIrp.SystemBuffer is then
copied into the lpOutBuffer supplied via
DeviceIoControl(). In this manner, WDM
drivers can communicate data back and
forth between Ring 3 and Ring 0. (In fu-
ture articles, I’ll examine other ways of
facilitating data exchange that don’t re-
quire the use of an intermediate buffer.)
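
The double copy can be imitated in ordinary user-mode C to make the data flow concrete. This is only a simulation under stated assumptions: buffered_call, DispatchFn, and demo_dispatch are hypothetical names, the eight bytes returned are made up, and no I/O Manager or driver code is involved. It assumes a single intermediate buffer large enough for either direction.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A handler reads its input from, and writes its output to, one buffer. */
typedef size_t (*DispatchFn)(void *systemBuffer, size_t inLen, size_t outCap);

/* Returns the number of bytes copied to out, or -1 if allocation fails. */
long buffered_call(DispatchFn dispatch, const void *in, size_t inLen,
                   void *out, size_t outCap)
{
    size_t cap = (inLen > outCap) ? inLen : outCap;
    size_t written;
    unsigned char *sys;

    assert(dispatch != NULL);
    sys = malloc(cap ? cap : 1);
    if (sys == NULL) {
        return -1;
    }
    if (inLen > 0) {
        memcpy(sys, in, inLen);          /* first copy: caller -> system */
    }
    written = dispatch(sys, inLen, outCap);
    if (written > outCap) {
        written = outCap;                /* never overrun the caller */
    }
    if (written > 0) {
        memcpy(out, sys, written);       /* second copy: system -> caller */
    }
    free(sys);
    return (long) written;
}

/* Hypothetical handler: returns a made-up 8-byte identifier. */
size_t demo_dispatch(void *systemBuffer, size_t inLen, size_t outCap)
{
    static const unsigned char id[8] =
        { 0x08, 0x00, 0x46, 0x01, 0x02, 0x03, 0x04, 0x05 };

    (void) inLen;
    if (outCap < sizeof id) {
        return 0;
    }
    memcpy(systemBuffer, id, sizeof id);
    return sizeof id;
}
```

The handler never touches the caller's memory directly, which is the point of the scheme; the price is two copies per request.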

Currently, the only Win32 function that
DDJDVCAP.SYS supports is
GET_NODE_UNIQUE_ID_CODE, a custom com-
mand that simply issues a 1394 Async
Read command to the DV camcorder’s
config ROM space to fetch its serial number.
This number is really defined as a
1394-specific 64-bit node unique iden-
tifier in the format in Example 2. This
node unique identifier is returned to the
caller and then displayed via the DDJDVTST
utility (Listing One).
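
How the 64-bit identifier splits into fields can be shown with a portable sketch. The test utility treats the upper 24 bits as the vendor ID and prints the remaining 40 bits as the model ID; IEEE 1394 names that 40-bit field chip_id. NodeUniqueId, bswap32, and node_id_from_raw are hypothetical names used for illustration, and the byte swap assumes, as the utility does, a little-endian host reading big-endian bus data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t vendorId;  /* upper 24 bits of the identifier */
    uint32_t chipIdHi;  /* next 8 bits */
    uint32_t chipIdLo;  /* low 32 bits */
} NodeUniqueId;

/* Reverse the byte order of a 32-bit value. */
uint32_t bswap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8)
         | ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}

/* raw[] holds two quadlets as a little-endian host sees big-endian data. */
NodeUniqueId node_id_from_raw(const uint32_t raw[2])
{
    NodeUniqueId id;
    uint32_t hi, lo;

    assert(raw != NULL);
    hi = bswap32(raw[0]);
    lo = bswap32(raw[1]);
    id.vendorId = hi >> 8;      /* 24-bit vendor field */
    id.chipIdHi = hi & 0xFFu;   /* top byte of the 40-bit remainder */
    id.chipIdLo = lo;
    return id;
}
```

The shift-and-mask form of the byte swap behaves the same as the inline-assembly version in the listing but compiles anywhere.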


The Binaries 

The code provided here (available elec- 
tronically in both source and executable 
form) is a fully functional skeleton 1394
WDM driver. When you plug a 1394 cam-
corder into the 1394 card for the first time,
Windows 98 will ask for the driver. Insert 
a floppy with the .INF and .SYS file and 
answer all the questions. The DV driver 
will then load. You can then run the .EXE 
file, which will read the serial number 
from the DV camcorder and display it on 
the screen. The device driver is completely 
PnP, loaded whenever a 1394 DV cam- 
corder is plugged in, and unloaded when 
the 1394 DV camcorder is unplugged or 
turned off. I’ve also included the Win32 
console application that calls the driver to 
get the serial number and prints it out to 
the screen. 

The files available electronically in- 
clude DDJDVCAP.C (the 1394 WDM cam- 
corder driver); DDJDVCAP.H (the 1394 
WDM camcorder driver header file); 
DDJDVW32.H (the header file used by 
Win32 applications that wish to com- 
municate with the DV driver); DDJDV- 
CAP.RC (the resource file for the cam- 
corder driver); DDJDVCAP.INF (the .INF 
file used to install the driver whenever 
a DV camcorder is connected to the sys- 
tem); and DDJDVTST.C (a Win32 con- 
sole application that interfaces to the DV 
camcorder driver). 

These files constitute the pieces nec- 
essary for a complete 1394 class driver 
package. These source files are com- 
pletely operational. In future articles, I'll 
examine what DV video frames look like, 
and what you have to do to capture 
them before sending a DV frame to a 
Win32 application for conversion into a 
picture. 


DDJ 


Listing One

/*----------------------------------------------------------------------*/
/* Filename: DDJDVTST.C                                                 */
/* Description: Dr. Dobb's Journal DV Frame Capture Driver Project      */
/*              Win32 Console Application for testing the DV Driver     */
/*----------------------------------------------------------------------*/
/* (C) Copyright 1994-1998 by Sequoia Advanced Technologies, Inc.       */
/* http://www.seqadvtech.com                                            */
/* All Rights Reserved.                                                 */
/*----------------------------------------------------------------------*/
#define STRICT
#include <stdlib.h>
#include <stdio.h>
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <winioctl.h>
#include "ddjdvw32.h"

/* Byte-swap a 32-bit value; the result is left in EAX as the return value. */
ULONG bswap(ULONG value)
{
    __asm mov eax, value
    __asm bswap eax
}

int main(int argc, char *argv[])
{
    HANDLE hDev;
    DWORD dwOutCount;
    DWORD inBuffer[2];
    DWORD dwRet;

    printf("Dr. Dobb's DV Camcorder Driver Test Utility\n\n");
    //
    // Open our DV Camcorder driver, if it is loaded.
    // This will result in DDJDV_Create being called in the driver.
    //
    hDev = CreateFile("\\\\.\\DDJDVCAP", GENERIC_WRITE | GENERIC_READ,
        FILE_SHARE_WRITE | FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, NULL);
    if(hDev == INVALID_HANDLE_VALUE)
    {
        printf("DDJDVCAP Driver Not Loaded!!\n\n");
        exit(1);
    }
    //
    // Send command to DV Driver to read Node Unique ID from Camcorder.
    // This will result in DDJDV_Dispatch being called in driver.
    //
    dwRet = DeviceIoControl(hDev, GET_NODE_UNIQUE_ID_CODE, NULL, 0,
        inBuffer, 8, &dwOutCount, NULL);
    //
    // Put in little-endian numeric format
    //
    inBuffer[0] = bswap(inBuffer[0]);
    inBuffer[1] = bswap(inBuffer[1]);

    printf("Camcorder Vendor ID = %x\n",(inBuffer[0] >> 8));
    printf("Camcorder Model ID = %x%.8x\n", (inBuffer[0] & 0xFF),inBuffer[1]);

    CloseHandle(hDev);
    return(0);
}
Robert D. Grappel


I was recently part of a project devel-
oping a system for aircraft pilots to ac-
cess the national ground weather-radar
database while in flight. This weather-
radar graphical database is generated
from the outputs of the FAA and National 
Weather Service network of radars cov- 
ering the continental United States and 
is updated every five minutes. Each pix- 
el in the database covers a square mea- 
suring two kilometers (about one nauti- 
cal mile) on a side. The content of each 
data pixel is a measure of the radar re- 
flectivity measured at that location — 
radar reflectivity is proportional to the 
water content in the atmosphere (the 
precipitation rate). 


Robert is a Staff Member of the Air Traffic 
Surveillance group at MIT’s Lincoln Lab- 
oratory. He can be contacted at grappel@ 
ll.mit.edu. 
This graphical database is available
through several commercial vendors;
it’s what you see displayed on The
Weather Channel or during typical TV
weather reports. Our system, on the other
hand, provides a low-speed digital
datalink connection from an FAA ground
computer to an avionics computer/dis- 
play located in the aircraft cockpit. The 
pilot can request the uplink of a portion 
of the weather database centered on a 
specified location (the aircraft’s current 
position, a particular airport, and so on) 






and with a range of up to 200 nautical 
miles from the centerpoint. (The pilot 
can also request images from past 
databases to observe storm motion at a 
particular location.) The actual graphical 
data uplinked to the aircraft for a given 
map image consists of an array of 
256x256 two-bit pixels, compressed to 
about 3500 total bits using a proprietary, 
lossy technique developed for the FAA. 
The aircraft computer/display avionics 
(effectively a 25-MHz, 486-based em- 
bedded PC running DOS) decompress- 
es the uplinked image and displays it 
with the weather intensities color-coded 
to parallel an airborne weather radar 
(light precipitation in green, medium pre- 
cipitation in yellow, and heavy precipi- 
tation in red). Figure 1 shows the in- 
strument panel of the test aircraft (Cessna 
172 Skyhawk). The datalink display/key- 
board is the ARNAV MFD (multifunction 
display) 5100 CRT in the radio stack (just
right of center). The ARNAV MFD 5010 
avionics computer is located at the far 
right, just behind the right control wheel. 
(In a normal installation, the MFD 5010 
would be installed out of sight.) The 
datalink “modem” function is performed 
by the Bendix/King KT70 transponder 
just below the MFD 5010. The weather 
display shown here is an actual North-up 
weather image centered on the Atlanta,
Georgia airport. The weather image is 
showing a 50 nautical mile radius, with 
some severe storm cells in the area. 
Figure 2, on the other hand, is a close- 
up of the datalink demonstration unit 
which groups the three datalink com- 
ponents together. The weather display 
is the same as in Figure 1. The soft-key 
labels on the right side of the display in- 
dicate the Traffic Information Service 
(TIS) and Weather Request (WXREQ) pilot 
inputs. The yellow TIS ALERT indication 
tells the pilot that there is another air- 
craft nearby that may be on a conflict- 
ing course. (The pilot presses the TIS 
button to get a display showing where 
other aircraft are in relation to the pi- 


lot’s aircraft.) Note the PCMCIA slot on 
the ARNAV MFD 5010. 

The weather map graphical image to 
be uplinked in response to a pilot’s re- 
quest is simply windowed from the se- 
lected national database. The image is 
oriented North-up, as is conventional for 
maps. This map-like display is desirable 
when pilots specify an airport or other 
landmark as the display centerpoint, but 
it is undesirable when pilots want the 
map centered on the aircraft’s current 
position. Pilots would like the map dis- 
play to match what is visible out of the 
aircraft windshield— rotated so that the 
aircraft's current heading points to the 
top of the display. However, the ground 
weather computer doesn’t know the 
heading of every aircraft that might make 
a request, and aircraft can and do maneuver
after the request has been made
and processed. What we needed was a 
way for the avionics computer/display 
to rotate the uplinked North-up weath- 
er image to correspond with the aircraft's 
current heading. 

Clearly, performing such a rotation can 
take a lot of processing. There are 65,536 
pixels to be recomputed for each weath- 
er map image. We decided that the avion- 
ics display would need to be refreshed 
about once per second— and the rela- 
tively slow airborne computer is likely to 
be busy doing other tasks (graphical 
weather display is only one of its func- 
tions). Performing a map image rotation 
would have to take only a fraction of a 
second to be practical. All the software 
had to be written in standard C and be as 
processor/operating system independent 
as possible. In this article, I’ll describe the 
algorithm we developed to efficiently per- 
form this rotation of graphical weather 
maps. I'll also suggest some techniques 
and approaches that you could use to op- 
timize other time-limited computer appli- 
cations. 


Warning...Trigonometry Ahead! 

The first step in developing the map ro- 
tation algorithm is to convert each map 
pixel’s Cartesian row index (y-coordinate) 
and column index (x-coordinate) into po- 
lar coordinates. In polar coordinates, the 
location of each pixel is defined by its dis- 
tance (R) from the center point of the map 
and an angle (A) around the map center 
point. Exactly how the angle A is to be 
measured is determined by convention. 
Math books define the polar angle as mea- 
sured counterclockwise from the positive 
x-axis (due East). Map makers, however, 
measure the angle clockwise from due 
North (the positive y-axis). Since we’re 
doing map rotation here, I'll use the map- 
maker’s convention for the polar angle A. 
Hence, the equations relating the Cartesian
and polar coordinates are:

X = R * sin(A)
Y = R * cos(A)


Rotating the weather map around its
center point by the angle B leaves R unchanged,
while the angle A changes to
A+B. The equations for the rotated pixel
row index (Yrot) and the rotated pixel column
index (Xrot) are simply:


Xrot = R * sin(A+B)
Yrot = R * cos(A+B)


Applying the standard formulas for the 
sine and cosine function of the sum of 
two angles and a bit of algebra yields: 


Xrot = {R * sin(A) * cos(B)} + {R * cos(A) * sin(B)}
Yrot = {R * cos(A) * cos(B)} - {R * sin(A) * sin(B)}
At first, these rotation equations don't
look too promising. Performing all that 
math for each pixel in the rotated map 
will take a lot of processing time. We'll 
have to simplify this before having an ef- 
ficient implementation. 


Simplify...Simplify...Simplify! 

The first simplification we can perform
on the rotation equations is to recognize
that the rotation angle B is the same for
every pixel in the map, so its sine and
cosine need to be calculated only once
for the entire map.
Let S=sin(B) and C=cos(B). The second
simplification is to notice that the rota- 
tion equations are actually operating on 
the initial, unrotated, Cartesian X and Y 
coordinates. Hence, we can rewrite the 
rotation equations for each pixel as: 


Xrot = (X * C) + (Y * S)
Yrot = (Y * C) - (X * S)


At this point, we’re down to four multiplications,
one addition, and one subtraction
per map pixel. However, we can
do better. Let’s attack the multiplications, 
since there are more of these than any 
other operation (and multiplication tends 
to be a more time-consuming operation 
than addition or subtraction). 

The values of X and Y that we will see
are just the indices of the map pixels.
Since the weather map is square (symmetric
in X and Y), we can precompute
the results of the multiplications for one
edge of the map and store them in two
tables. If you assume that the map has N
pixels on a side, the precomputation code
looks like Figure 3(a) and the rotation
equations are reduced to Figure 3(b).
That’s looking much better. We’ve replaced
all the multiplications with table
lookups. Note, however, that each of the
rotation equations contains one table
lookup based on row index (Y) and a
second table lookup based on column
index (X). Since the overall rotation algorithm
is going to iterate over all map
rows and all map columns, we can simplify
the table lookups even further. We
don’t need to redo the row lookups for
each pixel on a given row. Figure 4 presents
pseudocode for the basic loop
structure of the improved rotation algorithm.


Figure 1: Instrument panel of test aircraft (Cessna 172 Skyhawk).
Figure 2: Close-up of the datalink demonstration unit, which groups the three
datalink components together. The weather display is the same as in Figure 1.

Figure 3: (a) Precomputation code; (b) rotation equations. (Code annotations:
center point of a map edge; adjust to center of a given pixel; center-point
pixel coordinate; precompute cosine table; precompute sine table.)

Figure 4: Pseudocode for the basic loop structure of the improved rotation
algorithm.


It’s Never Quite that Simple!
Unfortunately, things aren’t quite as easy
as that. First of all, consider the square
input map rotated by some angle B that
is not a multiple of 90 degrees. The corners
of the input map stick out over the
edges. There are some input map pixels
that will not get mapped to a rotated
output pixel. Also, the corners of the
rotated map stick out beyond the edges
of the input map. You can deal with the
first situation by simply initializing the
output map ahead of time to a fixed null
value. (C’s memset function performs
this whole-array initialization very efficiently.)
In order to deal with the second
problem, you need to insert some
tests in the basic rotation algorithm to
check that the rotated coordinates (Xrot,
Yrot) do, in fact, denote valid map pixels.
We'll need a couple of if statements
inserted before the last line of the
pseudocode:


IF (Xrot < 0) OR (Xrot >= N) CONTINUE
IF (Yrot < 0) OR (Yrot >= N) CONTINUE

There’s one more problem lurking in
the basic pseudocode. The calculations 
for the rotated pixel coordinates Xrot 
and Yrot are done in floating-point math, 
yet the map pixel coordinates must be 
integers. We need to be careful about 
rounding these floating-point-to-integer
conversions. Normally, C would truncate 
the floating-point value—we appear to 
need rounding code inserted into the ro- 
tation algorithm. Adding this code into 
the algorithm’s inner loop will be cost- 
ly in terms of execution time, but it ap- 
pears that this can’t be helped. Each ro- 
tated map pixel (a little square) might 
cover parts of multiple input pixels — 
we need to be careful how we deter- 
mine which output map pixel we choose 
for each input map pixel. We could in- 
troduce distortions into the rotated 
map — or we could generate holes in 
the rotated map. 


Turn the Problem on Its Head 
Sometimes, the best way to attack an op- 
timization problem is to look at the al- 
gorithm from another direction. When I 
had just about conceded that rounding 
code was a necessary evil, a friend sug- 
gested that I look at the rotation prob- 
lem differently. Instead of rotating the in- 
put map into the output map through a 
rotation angle B, try rotating each pixel 
of the output map back to a pixel in the 
input map by the angle —B. This has no 
effect on the derivation of the rotation 
equations mentioned earlier—we don't 
care what the value of B is. Now, the 
aforementioned last line in the Figure 4 
pseudocode reads 


output_map[I][J] = input_map[Xrot][Yrot].


At first, this doesn’t appear to be an im- 
provement over the original form. We 
still have to convert the rotated pixel co- 
ordinates from floating point to integers. 
The point to notice is that now every 
output map pixel gets set from some in- 
put map pixel. We can skip the round- 
ing and every output pixel will still be 
set to some input pixel value— no holes. 
In fact, some input pixels might be used 
more than once, depending on the ro- 
tation angle. We can let C do its normal 
truncation and the result will be free of 
holes. 


The Final Rotation 

Listing One is the C code for the weath- 
er map rotation function. As you can 
see, it consists of little more than the 
pseudocode I’ve already discussed. The 
actual weather maps in our system do 
not use the 0th row or column, and an
additional “border” row and column is 
added to each map. This results in a 
border of zero-value pixels around the 

Software developers creating applications
for use on public and private
networks confront a number of performance
issues which, to date, have
restricted the utility of network-delivered
software. While processing power has increased
several orders of magnitude in the
past 20 years, bandwidth remains a bottleneck
and, for many users, will continue
to be an issue for years to come.

The concept-registry system (as well 
as concept-oriented programming, a 
technique that exploits this system) de- 
scribed here, makes it possible to write 
software that requires far less bandwidth 
to deliver, and thereby to increase ap- 
parent delivery speeds significantly (an 
order of magnitude improvement). It also 
creates a mechanism for disseminating 
reusable code throughout the Internet, 
effectively turning the Net into a repos- 
itory of reusable code that many devel- 
opers can utilize. This system does not 
introduce any fundamentally new ideas;
instead it employs a combination of ex- 
isting concepts and methods, including: 


• Numeric codes to represent symbols
(characters, machine instructions, and
the like).
• A distributed database that maps to/from
a numeric domain to other domains
(DNS, for example).
• Semantic networks.


The concept-registry implementation I 
present here is an accidental outgrowth 
of another project. Concept registry was 
originally intended for use in a multilin- 
gual communication system called “Pic- 
to” (short for “Pictograph”), a markup lan- 
guage that lets users publish simple 
messages that can be rendered into multiple
languages. The original idea behind
Picto was to create a chat tool best de- 
scribed as “emoticons on steroids,” where 
each symbol has a distinct meaning. 

Example 1 is a simple Picto message 
that translates to “Hello World” (where 
concept #1 is “Hello” and concept #2 is 
“World”). While not adequate for complex 
messages, this markup language can be 
used to convey simple messages that can 
then be rendered in multiple languages. 
People won't be using it to quote Shake- 
speare, but it works fine for exchanging 
simple messages with predictable gram- 
mar (multilingual chat is one candidate 
application). Real-time applications are es- 
pecially interesting candidates because 
users can adapt to the idiosyncrasies of 
the translation tool (for example, being 
forced to clarify meaning when a word 
has many possible meanings or uses). 

The concept-registry system, when used 
in conjunction with this markup language, 
maps numeric expressions into target lan- 
guages. Numeric concepts are tagged with 
usage parameters that describe how they 
are used in an expression, and how they 
are linked to other concepts in an ex- 
pression. 

To find out more about concept reg- 
istry and to contribute to this open-source 
project, visit http://www.picto.org/. There 
you will find open-source utilities and in- 
formation for use in building back-end 
and client-side implementations of this 
technique. These utilities and the source 
code for a concept-registry server are also 
available electronically from DDJ (see “Re- 
source Center,” page 5). 
Concept-Oriented Programming 
Concept-oriented programming is a straight- 
forward extension to object-oriented pro- 
gramming. Its primary contribution is to turn 
wide area networks (WANs) into a facility 
for software development and distribution. 
The most important additions are the 
creation of a global address space that 
uniquely identifies reusable machine in- 
structions, and a global network of reg- 
istry servers that cache these concepts 
for rapid retrieval at run time. There are 
numerous applications for the technique. 
The technique can be used with any pro- 
gramming language or operating system. 
The system described in this article lets 
concepts be defined in many machine- 
language and natural-language domains 
simultaneously (a capability that also allows
it to be used as a global help file).

Suppose you write a sorting algorithm 
that you want to share with other pro- 
grammers. You compile this into a DLL 
or some other executable form, so that 
it can be easily referenced by other pro- 
grams at run time. You would register the 
procedure, and would be assigned a 
unique numeric ID for your sorting al- 
gorithm. Say, for the sake of example, 
that your algorithm becomes concept 
“#51221.” No other algorithm will be as- 
signed this number. Other programmers 
could then reference this procedure in their 
programs with a statement such as this hy- 
pothetical example: 


SortedScores = SortClass.SortScores(Score) 
UseConcept(51221) 





What's a Concept? 
The concept-registry system creates a nu- 
meric address space for concepts. In this 
system, concepts are simply numeric 
placeholders for ideas. A concept could 
refer to a reusable machine instruction, 
VRML object, abstract idea, or natural- 
language expression. The concept-registry 
system creates a numeric address space 
for ideas. Each concept is given a unique 
numeric address so that it will not be con- 
fused with other concepts. Just as you re- 
quest an IP address for a new workstation, 
you would request a concept-registry sys- 
tem address for a new idea (whether that 
idea is a machine instruction or natural- 
language expression). Table 1 lists some 
hypothetical concept-registry entries. 
The concept-registry system consists of 
two important components: 


• A numeric address space, which
uniquely identifies all globally registered
concepts.
• Concept-registry servers, which are distributed
throughout public and private
networks, process concept-resolution
requests, and disseminate translation
tables throughout the network.


The concept-registry system is, in a 
sense, like the domain name system 
(DNS), except that it maps numerically 
identified concepts into many language 
domains. What is especially interesting is
that the concept-registry system typically
indexes a concept in multiple languages. In
Example 2, for instance, the system: 


→ Translates concept #51221 into Java
bytecode.
← Concept-registry server replies with
Java bytecode for this instruction.

→ Translates concept #51221 into
English.
← Concept-registry server replies with
English description of what the procedure
does.

→ Translates concept #51221 into
Spanish.
← Concept-registry server replies with
Spanish description of what the procedure
does.


In this example, the concept-registry serv- 
er is processing requests to translate numer- 
ically identified concepts into either machine 
instructions or natural-language expressions 
(that is, to provide explanation or documen- 
tation for the concept). The concept-registry 
server is not required to understand the in- 
formation it is providing. Like other direc- 
tory servers, it merely maps information 
from one domain into another. 


Concept-Registry System Services 


The concept-registry system provides the 
following basic services: 

• Concept resolution/translation. Local
concept-registry servers process requests
to translate a numeric concept
into a target language. Concepts can be
translated into machine languages, display
languages, or natural-language expressions.
• Concept distribution/replication. Just as
the DNS distributes updated host tables,
the concept-registry system will update
concept registries on a daily basis.
• Conflict resolution. Master concept registries
ensure that duplicate ID numbers
are not assigned to concepts, thus ensuring
that each concept has its own
unique address.
• Reverse lookups. Concept-registry servers
can search for a pattern in their table of
registered concepts. This is used in multilingual
applications, specifically to create
lexicon services and translation aids.


Creating High-Performance Network Software

One of the greatest practical benefits of
this system is the ability to reduce the size
of network-delivered software, therefore
increasing apparent transmission speeds.
The system creates, in effect, a smart
caching system that eliminates the redundant
transmission of instructions, and
lets users cache large libraries of reusable
machine instructions in close proximity
to end users.


Example 1: Picto implementation of
concept registry.

Instead of transmitting the entire program
to users, you can send only the
upper layers of the program, which, in
turn, reference numerically identified
instruction sets that may or may not be
cached on the end user’s computer. If 
the end user’s computer has encoun- 
tered these concepts before, it will 
fetch the underlying instructions from 
a local cache. If not, it will contact a 
nearby concept-registry server to re- 
quest the underlying instructions. While 
this introduces obvious security issues 
(see http://www.picto.org/), the tech- 
nique lets you realize order-of-magnitude 
improvements in apparent delivery 
speeds. 
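
The lookup order described above (local cache first, nearby registry server second) can be sketched in C as follows. Everything here is a stand-in invented for illustration: the names resolve_concept and fetch_from_registry, the fixed-size cache, and the string that plays the role of the underlying instructions. No real concept-registry protocol or API is implied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CACHE_SLOTS 16

typedef struct {
    unsigned long id;    /* numeric concept ID; 0 marks an empty slot */
    const char *code;    /* stand-in for the concept's instructions */
} cache_entry;

static cache_entry cache[CACHE_SLOTS];
static int registry_requests;   /* counts simulated trips to the server */

/* Hypothetical stand-in for a request to a nearby concept-registry server.
 * Only concept 51221 (the article's sorting example) is "registered". */
static const char *fetch_from_registry(unsigned long id)
{
    registry_requests++;
    return (id == 51221UL) ? "sort-routine" : NULL;
}

/* Resolve a concept: use the local cache if the concept has been seen
 * before, otherwise ask the registry server and remember the answer. */
const char *resolve_concept(unsigned long id)
{
    cache_entry *slot = &cache[id % CACHE_SLOTS];
    const char *code;

    assert(id != 0UL);                 /* 0 is reserved for empty slots */
    if (slot->id == id && slot->code != NULL)
        return slot->code;             /* encountered before: no network trip */

    code = fetch_from_registry(id);
    if (code != NULL) {                /* cache it for next time */
        slot->id = id;
        slot->code = code;
    }
    return code;
}
```

Resolving the same concept twice costs only one trip to the server in this sketch, which is the effect the apparent-speed argument depends on. A real client would also need eviction, versioning, and the security checks the text alludes to.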

I call these programs “origami exe- 
cutables” because they are comparative- 
ly tiny programs consisting of numeric 
pointers to underlying instruction sets 
(which may themselves contain refer- 
ences to other concepts). These programs 
expand into a complete set of instruc- 
tions at run time, thus increasing appar- 
ent transmission speed to users. (While 
this technique will substantially improve 
delivery times, it will not improve exe- 
cution speed.) 


Example 2: Typical concept-registry resolution transaction. → denotes a
message sent from client to concept-registry server; ← denotes a message sent
from server to client.





Table 1: Typical concept-registry entries. 
Example Scenario #1

Consider, for example, a scenario in 
which a corporate workgroup is running 
applications over a WAN (Figure 1). A 
corporation installs concept-registry 
servers throughout its WAN. The concept- 
registry servers have a 10/100 Mbits/sec 
path to end users, and are constantly 
updated with the latest concepts (much 
as DNS automatically distributes updates 
to DNS servers daily, so too will the 
concept-registry system). 

Users on these networks will receive 
most of the instructions from local 
concept-registry servers that have a 10/100 
Mbits/sec path to users. Since these 
concept-registry servers cache instruc- 
tions used by the entire workgroup, the 
performance improvements are impres- 
sive. Instead of loading applets from a 
central point through a congested WAN 
link, users have an apparent 10/100 
Mbits/sec connection to the server. 

To calculate the performance improvement,
use the formula:


ACR = TB/TC


where TC is the time to deliver code using
the concept-registry technique and TB
is the time to deliver code using the
conventional technique. Then calculate:


TB = (UC + PL + CL)/IBW
TC = (UC/IBW) + (PL/CBW) + (CL/DBW)


where ACR is the apparent compression
ratio, IBW is the Internet bandwidth (effective
throughput from end user to distant
server), CBW is the bandwidth from
end user to nearby concept-registry server,
DBW is the bandwidth to local disk
drive or LAN-based registry, UC is the
unique code size in KB (your program
and its unique libraries), PL is the size
of publicly registered concepts in KB,
and CL is the size of locally cached concepts
in KB.

The key metric — apparent compres- 
sion ratio — is the perceived bandwidth 
available to load the program. The tech- 
nique easily increases apparent through- 
put several times, and when fully exploit- 
ed can deliver order of magnitude 
improvements. 


Example Scenario #2 

In this scenario, assume a 500-KB program 
contains code in which 475 KB of code 
is stored in the concept-registry system 
and 25 KB is unique to the application. 
Users are on a small LAN with a 128-Kbits/sec
connection to the Internet, and an in-house
concept-registry server that has a
10 Mbits/sec path to users. The user’s disk 
drive has 100 Mbits/sec of bandwidth. The 
program contains several widely used con- 
cepts, some of which (say 25 percent) the 
user has encountered before. According 
to our formula, the apparent compression
ratio will be 16.86:1, making the
user’s 128-Kbits/sec connection look like a 2.15
Mbits/sec connection.
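
The arithmetic behind this scenario can be checked with a short C sketch. It rests on assumptions that the text does not spell out: the Internet link is taken as 128 Kbits/sec, the 25 percent already encountered is taken as 25 percent of the 475 KB of registered code (so CL = 118.75 KB and PL = 356.25 KB), and decimal units are used throughout. The type and function names are hypothetical.

```c
#include <assert.h>

/* Sizes are in kilobytes, bandwidths in kilobits per second (decimal units). */
typedef struct {
    double uc_kb;      /* UC: code unique to the application */
    double pl_kb;      /* PL: publicly registered concepts fetched from the server */
    double cl_kb;      /* CL: concepts already cached locally */
    double ibw_kbps;   /* IBW: Internet bandwidth */
    double cbw_kbps;   /* CBW: bandwidth to the nearby registry server */
    double dbw_kbps;   /* DBW: bandwidth to local disk or LAN-based registry */
} delivery_params;

/* Transfer time in seconds for a payload of the given size. */
static double seconds(double kb, double kbps)
{
    assert(kbps > 0.0);
    return kb * 8.0 / kbps;
}

/* ACR = TB/TC, with TB and TC as defined in the text. */
double apparent_compression_ratio(const delivery_params *p)
{
    double tb = seconds(p->uc_kb + p->pl_kb + p->cl_kb, p->ibw_kbps);
    double tc = seconds(p->uc_kb, p->ibw_kbps)
              + seconds(p->pl_kb, p->cbw_kbps)
              + seconds(p->cl_kb, p->dbw_kbps);
    return tb / tc;
}
```

Filling in UC = 25, PL = 356.25, CL = 118.75, IBW = 128, CBW = 10,000, and DBW = 100,000 gives a ratio of roughly 16.8:1 under these assumptions, close to the 16.86:1 quoted; the small difference depends on the unit conventions chosen. Almost all of TC is the time to send the 25 KB of unique code over the slow link.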

To support capability such as this, pro- 
gramming languages need to be extend- 
ed to support concept notation, and to 
create executable code that can be dis- 
tributed independent of the entire pro- 
gram (a mini DLL, in other words). 
Adding support for concept notation to 
a program is not that difficult. When con- 
cept notation is incorporated into lan- 
guages, the compiler merely needs to be 
able to talk to a concept-registry server 
to obtain the machine language “transla- 
tion” for a given concept and merge this 
code into a program, either at compile 
time or run time. The details of how con- 
cept notation is expressed in each lan- 
guage vary. Examples of how this might 
appear include: 


SimpleGrid[8991](x,y).contents=balance; 
SortedScores = SortScores(ClassScores As 
Array) UseConcept(51221) 


Bind SortClass Using 51221, 
Bind HistogramClass Using 78910; 


Concept-Oriented Operating Systems 

Concept-oriented techniques can also 
be used to build operating systems. A 
concept-oriented operating system 
would have some attractive features 
compared to current systems, including: 


• Compact design. The OS could be distributed
as a very small package that
would then obtain additional OS components
from the concept-registry system.
• Continual evolution. The OS would
evolve automatically as new components
are registered. This does away
with the notion of upgrading an OS.
• Network appliances. Such an OS
would be highly useful for inexpensive
network appliances. The appliance
would contain only the code
needed to boot itself, and would
obtain higher level components from
nearby concept-registry servers, thus
reducing the cost of maintaining these
devices.
• Rapid innovation. Automatic dissemination
of updates to users increases
the rate at which the OS evolves. An
open-source OS based on this technique
would benefit from contributions
from many sources.
• Automated replication. Every machine
running a concept-oriented OS could,
in turn, become a concept-registry server,
providing nearby machines with one
or many high-speed servers to talk to.
Each new machine increases the overall
processing capacity of the concept-registry
system as a whole.


Figure 1: Concept-registry server directly connected to the user’s LAN/WAN,
providing an apparent 10/100-Mbits/sec connection to server.


A concept-oriented OS would not 
be fundamentally different from a con- 
ventional OS, except that all of its com- 
ponents would be registered in the 
concept-registry system. The only new 
feature is the use of the global address- 
ing scheme provided by concept-registry 
systems to track and retrieve components 
from the network. 

A hypothetical OS could be delivered 
as a small package that would provide 
basic I/O, logic, and not much else— just 
enough to boot the machine in VGA 
mode and start talking to the network. 
Once launched, the OS would automati- 
cally obtain additional concepts required 
by the OS. This could be done on an as- 
need basis (don’t download the floating- 
point math class library until it is need- 
ed), or on a preemptive basis (download 
concepts in order from most used to least 
used). Through sleight of hand, you could 
build an OS that appears to be small 
enough to fit on a floppy disk, yet is in- 
finitely extensible. 


Additional Applications 
Since the concept-registry system can 
translate numeric concepts into multiple 
machine- and human-language domains, 
the system can be used to store docu- 
mentation for machine instructions. Be- 
cause the system is open, developers in 
many countries could contribute com- 
ments and documentation for publicly 
registered components. Therefore, con- 
cept registry can be used as a globally 
distributed help file for the components 
registered, with developers in many coun- 
tries contributing to the knowledge base. 
Again, concept registry was originally 
developed to support language transla- 
tion aids, such as tools to translate for- 
eign words and phrases. One such ap- 
plication is a web browser plug-in that 
uses concept registry to look up transla- 
tions for highlighted words and phrases. 
Since concept registry can index con- 
cepts in any number of languages, this 
plug-in can serve as a universal transla- 
tion dictionary. 


Conclusion 

The concept-registry system, and pro- 
gramming techniques that leverage it, are 
still in embryonic stages of development. 
Consequently, your criticism and code are 
welcome. 


DDJ 

Video for Windows (VFW), introduced
by Microsoft during the 16-bit
Windows era, was designed to
let applications interact with video-capture
cards (also known as “video grabbers”
or “frame grabbers”) in a consistent
manner, allowing applications to display
and record video from any device.

I wrote an ActiveX control, oVFW, that 
encapsulates VFW’s features, allowing ap- 
plications — including Visual Basic apps — 
to interact with a video-capture card. In 
this article, I'll describe oVFW, and present
a sample Visual Basic application that
uses this control. 


Video for Windows 


Video for Windows exposes a set of API 
functions that lets an application create a 
VFW window, attach a capture driver to 
it, capture a frame (or a video stream), and 
preview live video. The VFW SDK comes 
standard with most Microsoft development 
environments, such as Visual C++. 

Most of the VFW API functions are ac- 
tually simple macro wrappers around the 
Win32 SendMessage function with prefilled 
parameters. For example, the VFW API 
function capGrabFrameNoStop is actually 
a macro consisting of SendMessage called 
with WM_CAP_GRAB_FRAME_NOSTOP 
as a parameter. 

To interact with VFW, applications must 
first create a window that VFW can con- 
trol. This window can be a child of any 
of the application’s windows, so it appears 
to be fully controlled by the application 
itself. In reality, VFW handles all of the
WM messages that the window receives.
To create the VFW window, the capCreateCaptureWindow
API function is called.
The capDriverConnect API function causes
VFW to bind a video-capture driver to
a particular VFW window.


Ofer is the technical director of Quality By
ming with Visual Basic 5 (McGraw-Hill).

Once a driver is connected to a VFW window, the application can make the window show live video, query it for video-driver support parameters, or have it capture a frame.


Live Video 
Most video-capture cards offer two modes for displaying live video: preview and overlay. Overlay-enabled video-capture cards interface directly with the display adapter, transferring video information directly to it via the PCI bus (typically using a fast DMA transfer) or through a proprietary bus between the video-capture card and video-display adapter. Overlay mode uses very few of the host machine’s resources, because DMA hardware, not the CPU, transfers the video frames. Typically, the capture card becomes the master of the PCI bus for a short duration and uses a DMA channel to transfer a burst of video information directly to the video-display adapter’s on-board memory.

In preview mode, VFW continually requests new frames from the driver and paints those frames itself using BitBlt or StretchBlt. In this case, the CPU requests the data, the capture card transfers the data to the system’s regular memory, after which the CPU must transfer it to video memory again using BitBlt. Due to the amount of data transferred by the CPU, preview mode provides much slower frame rates. The frame rate drops even more with frame sizes larger than CIF (320×240).
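
A quick back-of-the-envelope calculation shows why frame size hurts preview mode so much. The sketch below assumes uncompressed frames; the 24 bits per pixel and 15 frames per second in the usage note are illustrative figures, not measurements from any particular card. Because preview mode moves each frame twice (card to system memory, then system memory to video memory), the total traffic is roughly double what the function returns.

```cpp
#include <cstdint>

// Bytes per second for one pass of uncompressed video at the given format.
constexpr std::uint64_t previewBytesPerSecond(std::uint32_t width,
                                              std::uint32_t height,
                                              std::uint32_t bitsPerPixel,
                                              std::uint32_t framesPerSecond) {
    return static_cast<std::uint64_t>(width) * height * bitsPerPixel / 8
           * framesPerSecond;
}
```

For example, previewBytesPerSecond(320, 240, 24, 15) is 3,456,000 bytes per second, and doubling both dimensions to 640×480 quadruples that figure.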




Users typically prefer overlay mode, be- 
cause of its low system overhead and high 
frame rates. However, overlay mode usu- 
ally causes some slowdown on part of the 
system bus (slowing down other trans- 
fers because the PCI bus is busy with 
video information), and it is not always 
supported by the combination of video- 
capture cards, PCI bus chipsets, and video- 
display adapters (especially those with 
nonlinear memory). When overlay mode 
is not supported, preview mode is used 
instead. 


Creating the oVFW Control 

I created the oVFW control to give all 
types of applications, particularly Visual 
Basic applications, simple and consistent 
access to VFW. The control gives appli- 
cations access to VFW driver information, 
overlay and preview live video, and sin- 
gle frame capture. I used the ActiveX Tem- 
plate Library (ATL) to create the control. 

ATL is modeled after the C++ Standard Template Library (STL), and provides an application with full support for COM via easy-to-use template base classes. ATL requires little or no dependency on external DLLs or libraries, keeping a COM application very simple.

The ATL framework does not interfere 
with the actual object-oriented design of 
an application. A C++ class can contain 
any number of interfaces via inheritance. 
When creating a simple ActiveX control, ATL has little to say about what kind of user interface the control will have. ATL is useful for creating all types of COM components, some with a user interface (ActiveX controls) and some without (Automation objects).

I used Microsoft Developer Studio’s ATL COM application wizard to construct a new ATL workspace. The wizard was set to create an ATL DLL, which merges the proxy/stub code into the DLL itself. If the proxy/stub code were not merged into the DLL, ATL.DLL would have been required to use oVFW.

Next, I created a new ATL control, 
which served as the basis for the oVFW 
control. I then added the properties and 
methods for the control, as well as a 
simple automation object, which is used 
for querying VFW drivers for their ca- 
pabilities. 

The control supports both connection 
points and error handling. Support for er- 
ror handling allows the control to emit er- 
rors to hosts that support COM error han- 
dling (Visual Basic apps, for instance). 
Support for connection points allows the 
control to trigger the enumeration event, 
which is described later. 


Enumerating VFW Drivers 

Since VFW limits a system to one window per driver, users must install more than one video-capture card to view several simultaneous live video feeds. Because video-capture cards are becoming increasingly popular and inexpensive, having multiple video-capture cards in a single machine is feasible. My own machine, for example, sports a TV card for viewing cable television while I program (that’s my story and I’m sticking to it), and a second card for video conferencing and other capture applications. In systems with specialized applications, multiple video-capture cards can be used to enhance the application and provide additional functionality (a security application that would allow the operator to view several live images inside a bank, for instance).

VFW supports up to 10 different drivers 
running on a particular machine, and as- 
signs an index to each. For an application 
to select the appropriate available driver, 
the oVFW control gives the hosting ap- 
plication the ability to enumerate the avail- 
able VFW drivers. 

To enumerate VFW drivers, the control has a function called DriversEnum, which attempts to attach to each driver. DriversEnum first creates a specialized VFW window using capCreateCaptureWindow. Then capDriverConnect is called for each of the driver indices. If capDriverConnect succeeds, the driver’s properties are retrieved. If a driver is currently in use elsewhere in the system, the control will not detect it and will not be able to retrieve that driver’s properties.

Once oVFW retrieves the driver infor- 
mation, it needs to expose it to the host- 
ing application. One way to do this would 
be to store the information in a COM col- 
lection, and expose the collection to the 
application. However, I chose not to use 
a COM collection for two reasons: 


• I wanted the index of the collection to correspond to the index of the driver. Because the active VFW driver indices might not be sequential, the collection index would not always equal the driver index.

• Using a COM collection would require caching the driver-information object. I wanted the hosting application to be able to control what information it wanted to retrieve and cache.




Events 



Rather than use a COM collection, 
DriversEnum uses COM connection 
points (events) to inform the hosting ap- 
plication which drivers are available and 
what the properties of those drivers are. 
The hosting application should supply at 
least one event handler for each event 
(otherwise, the event simply dissipates 
when it is fired). When DriversEnum suc- 
cessfully connects to a driver, it sets up 
a driver-information object and triggers 
the enumeration event. When the event 
handler returns, DriversEnum detaches 
from the driver and tries to attach to the 
next driver. 
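
The enumeration flow described above can be sketched in portable C++ as follows. This is an illustration under stated assumptions, not the oVFW source: FakeCaptureSystem, DriverInfo, and enumerateDrivers are hypothetical names, the connect call is simulated rather than routed through capDriverConnect, and a std::function callback stands in for the COM connection point.

```cpp
#include <functional>
#include <set>
#include <string>

// Hypothetical stand-in for the driver-information object handed to the host.
struct DriverInfo {
    int index;
    std::string name;
};

// Hypothetical stand-in for the capture system: the set holds the indices
// whose drivers are installed and not in use elsewhere.
struct FakeCaptureSystem {
    std::set<int> available;
    bool connect(int index) const { return available.count(index) != 0; }
    void disconnect(int) const {}
};

constexpr int kMaxDrivers = 10;  // the article states VFW supports up to 10 drivers

// Tries every index and fires the handler once per driver that connects.
// Returns how many drivers were reported.
inline int enumerateDrivers(const FakeCaptureSystem& sys,
                            const std::function<void(const DriverInfo&)>& onDriver) {
    int found = 0;
    for (int i = 0; i < kMaxDrivers; ++i) {
        if (!sys.connect(i)) continue;     // missing or busy drivers are skipped
        DriverInfo info{i, "driver #" + std::to_string(i)};
        if (onDriver) onDriver(info);      // with no handler, the event dissipates
        sys.disconnect(i);                 // detach before trying the next index
        ++found;
    }
    return found;
}
```

With drivers available at indices 0 and 3, the handler is called with exactly those two indices, so the host sees the real driver index even though the indices are not sequential.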

Listing One shows the event interface declared in the project’s IDL file so that oVFW can support events. Listing Two shows how oVFW informs the world that it will trigger the interface, not receive it.

Figure 1 shows the Developer Studio ATL proxy generator used to create an ATL base class for CoVFW. The base classes responsible for allowing CoVFW to trigger events are IProvideClassInfo2Impl<>, IConnectionPointContainerImpl<CoVFW>, and CProxy_oVFWEvents<CoVFW>, all in Listing Three.

The DIID__oVFWEvents constant must be declared next, as in Listing Four. This constant defines the unique identifier of the event interface. Also, the connection point entry must be declared; see Listing Five.

The output for the ATL proxy generator is an automatically created CProxy_oVFWEvents class, which provides a trigger function for the class’s events (Fire_DriversEnum). The full listing for the EnumDrivers function is available electronically; see “Resource Center,” page 5.


Support for Visual C++ 6

Porting this project to Visual C++ 6 was a reminder that new versions of compilers often require code modifications instead of just a simple recompilation. The following issues had to be resolved to get Visual C++ 6 to produce a working control:

• The Video for Windows macros (found in VFW.H), such as capPreviewScale, utilize the function IsWindow, assuming that this function from the Windows API will be used. Apparently, the ATL base classes now include their own IsWindow method, which has a different parameter list. Since C++ compilers first try to match functions against class methods, each such macro generates a compiler error. I bypassed this problem by adding to my control class an IsWindow method that calls the Windows API function.

• Some changes were required in order for VC++ 6 to accept the DIID__oVFWEvents object. These were mostly semantic and location problems (that is, the statement had to take place before oVFW.h was used).

• Adding events in Visual C++ 6 is much simpler than with earlier versions. An ATL class (object or control) that specifies that it will use events automatically attaches an event interface to the object’s class. Now, the event-interface class can be accessed as if it were a standard COM interface (methods can be attached to it, for example). Finally, after VC++ compiles the IDL source to a binary TLB file, right-clicking the ATL class and selecting the “Implement Connection Point” menu will create the proxy class and all of the other requirements for the class to fire events.

- O.L.


Figure 2: The hosting Visual Basic control.


The Current VFW Driver

Driver information can be accessed using the CoVFWDriverInfo class. This class is an additional ATL class that supports error information. The class retains local index and WndC variables that let the class maintain the identity of the VFW driver to which it is attached. The oVFW control keeps one copy of the current driver, constructed in CoVFW’s constructor (Listing Six), throughout its lifespan. This COM-enabled copy of the driver information object can be safely passed to the hosting application via a call to get_CurDriver; the implementation is available electronically.

The ATL-provided CComObject template class provides COM access and reference counting, enabling outside objects to access these COM objects. AddRef is used to increment the reference count and Release decrements it (and deletes the object if no other references to the object exist). The CoVFW destructor releases the object by performing a Release call; see Listing Seven.

Note that the pDriverInfo object is not deleted by the CoVFW destructor, because it may still have outside references in the hosting application, even after the actual control is released. Deleting the object would probably cause the hosting application to throw a protection fault the next time it attempts to access the object. Instead, the window handle is set to NULL, causing
the driver to detect the problem and return 
an error to the host (Listing Eight). 
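
The lifetime rule described above (release, never delete, and invalidate the handle so later calls fail cleanly) can be illustrated without COM. The following is a simplified sketch, not ATL’s CComObject: DriverInfoObject, invalidate, isValid, and liveObjects are hypothetical names, and a plain integer stands in for the window handle.

```cpp
// Minimal reference-counted object standing in for the COM driver-information object.
class DriverInfoObject {
public:
    // Number of instances currently alive, kept only so the sketch can be checked.
    static int& liveObjects() {
        static int count = 0;
        return count;
    }

    DriverInfoObject() { ++liveObjects(); }

    unsigned long AddRef() { return ++refs_; }

    unsigned long Release() {
        unsigned long remaining = --refs_;
        if (remaining == 0) delete this;   // the last reference is gone
        return remaining;
    }

    // Called when the owning control goes away: the object survives for any
    // remaining holders, but can now report an error instead of faulting.
    void invalidate() { windowHandle_ = 0; }
    bool isValid() const { return windowHandle_ != 0; }

private:
    ~DriverInfoObject() { --liveObjects(); }  // only Release may destroy the object

    unsigned long refs_ = 0;
    int windowHandle_ = 1;                    // nonzero stands for a live VFW window
};
```

If the control and the host each hold one reference, the control’s Release leaves the object alive but invalid, and the object is destroyed only when the host releases it as well.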


Viewing Live Video 

The ShowLiveVideo method turns live 
video on or off. When called, ShowLive- 
Video creates the VFW child window and 
attaches it to the selected driver index. 
The hosting application can select a par- 
ticular VFW driver index (set during the 
enumeration process) by setting the con- 
trol property iVideo. In a typical single 
video-capture-card environment, this pa- 
rameter will be zero. 

The AttachVFW function (available elec- 
tronically) determines the size of the client 
rectangle for the control, and calls the 
VFW API function capCreateCapture- 
Window, which creates a VFW window 
as a child of the oVFW control, causing 
the entire visible space of the control to 
contain the video window, rather than 
empty space. The subsequent call to cap- 
DriverConnect lets the VFW window know 
that it should be attached to a particular 
VFW driver. If this function call fails, the 
driver is currently unavailable. 

After attaching to the appropriate VFW 
driver, the StartLiveVideo function (avail- 
able electronically) determines if the ap- 
plication wants an overlay window. If so, 
it determines if the attached driver is ca- 
pable of supporting overlay mode. If not, 
the function simply turns off the overlay 
flag. This flag can be modified or inspected 
by the hosting application to determine if 
overlay mode is being used for the live- 
video display. 

StartLiveVideo calls capPreviewScale, which stretches the live video within the control’s client area. Next, the function determines if it should use overlay mode. If so, capOverlay is called to initiate live video. If preview mode is being used, the preview frame rate is calculated (using the frames per second requested in the previewRate control property) and committed using capPreviewRate. The capPreview function is used to initiate the preview mode. At this point, live video is being displayed on screen.
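
The decision StartLiveVideo makes can be summarized in a few lines. This sketch is an assumption about the logic, not the electronically available source: LivePlan and planLiveVideo are hypothetical names, and the conversion assumes the preview rate is committed as milliseconds per frame, so a frames-per-second request becomes 1000 divided by the requested rate.

```cpp
// Outcome of the mode decision for one live-video session.
struct LivePlan {
    bool useOverlay;        // final state of the overlay flag
    int previewIntervalMs;  // milliseconds between preview frames; 0 in overlay mode
};

// Overlay is used only if the host asked for it and the driver supports it;
// otherwise the flag is turned off and a preview interval is computed.
inline LivePlan planLiveVideo(bool overlayRequested, bool driverHasOverlay,
                              int previewFps) {
    LivePlan plan{overlayRequested && driverHasOverlay, 0};
    if (!plan.useOverlay) {
        int fps = previewFps > 0 ? previewFps : 1;  // guard against zero or negative rates
        plan.previewIntervalMs = 1000 / fps;
    }
    return plan;
}
```

A host that requests overlay on a driver without overlay support gets a plan with useOverlay set to false, which mirrors how the control’s overlay flag can be inspected afterward.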

The StopLiveVideo function (available electronically) turns off both overlay and preview modes, and calls DetachVFW (available electronically), which detaches from the VFW driver. This function must also be called in the control’s destructor to prevent a situation where live video is left open while the VFW hosting window has been destroyed. If this were to occur, subsequent attempts to attach to the selected VFW driver would be unsuccessful, and the video driver would remain inactive until the system restarts.

Some video-card drivers, such as ATI TV’s drivers, have problems determining whether the video window has moved. This is because VFW does not monitor window movement, so the driver itself has to try to detect movement of the window containing the live video. Since the parent window, not the live-video window, has actually moved, some drivers may not realize that they have to move the overlay window too. This can result in live video playing where it shouldn’t or not playing where it should.



Figure 3: Sample web page. 


One crude fix for this problem is to have 
the hosting application send WM_MOVE 
messages to the video window whenever it 
moves, subsequently causing the driver to 
pick up the message and place the live video 
in its correct position. 

Unfortunately, this is ineffective in Vi- 
sual Basic apps (no move event is sup- 
plied to Visual Basic forms) where sub- 
classing is difficult and expensive, 
especially for just supplying WM_MOVE 
messages. Because the problem only oc- 
curs in overlay mode, using preview 
mode is an effective but crude solution. 
In preview mode, each frame is sent to 
its correct screen location using GDI 
functions instead of a DMA transfer. 


Capturing Frames 

The control implements two different methods of capturing single frames: capturing frame information to the clipboard and capturing to a bitmap file. Visual Basic applications can easily access clipboard data using Listing Nine.

To capture to the clipboard, the Cap- 
tureEditCopy method first determines if 
the driver is attached (that is, if live video 
is presently being shown). If so, cap- 
GrabFrameNoStop is used to capture the 
image from the driver without stopping 
the live video. capEditCopy is then used 
to save the captured image onto the clip- 
board. capEditCopy can be used with- 
out first calling capGrabFrameNoStop; 
however, this stops the live video on 
some drivers. 

If the control finds that no driver is attached, it will quickly attach a driver, capture a frame, and detach from the driver. The API function capGrabFrame is the functional equivalent of capGrabFrameNoStop, except that it definitely stops live video (both preview and overlay) when it is called.

Capturing to a bitmap file is similar 
to capturing to the clipboard, except 
capFileSaveDIB is used instead of cap- 
EditCopy. This is done by the Capture- 
ToBitmapFile method (available elec- 
tronically). 
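
The capture sequence can be sketched as a small state-preserving routine. This is a hypothetical model, not the control’s code: FakeCapture logs each step instead of calling VFW, and the sketch uses the same no-stop grab in both cases because the article does not spell out which grab call is used when no driver is attached.

```cpp
#include <string>
#include <vector>

// Hypothetical driver wrapper that records each step instead of calling VFW.
struct FakeCapture {
    bool attached = false;
    std::vector<std::string> log;

    void attach()          { attached = true;  log.push_back("attach"); }
    void detach()          { attached = false; log.push_back("detach"); }
    void grabFrameNoStop() { log.push_back("grabNoStop"); }
    void copyToClipboard() { log.push_back("editCopy"); }
};

// Grabs one frame to the clipboard, leaving the attachment state as it was found.
inline void captureToClipboard(FakeCapture& cap) {
    const bool wasAttached = cap.attached;
    if (!wasAttached) cap.attach();   // no live video: attach just for this capture
    cap.grabFrameNoStop();            // grab first so running live video is not stopped
    cap.copyToClipboard();
    if (!wasAttached) cap.detach();   // restore the original state
}
```

Capturing to a bitmap file would follow the same pattern, with a save step in place of copyToClipboard.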


Sample Application 

A Visual Basic application (available electronically) exercises all of the features that the control has to offer. It provides live video depending on the overlay and video index parameters. It determines the capabilities of the current driver, and enables or disables some of the feature buttons accordingly. Figure 2 provides a view of the hosting Visual Basic control.

The ActiveX control can also be used 
from Microsoft Internet Explorer (IE), so 
that you can view live video on a web 
page. The most prominent difference 
when dealing with IE as a hosting envi- 
ronment is its stricter security. Internet Ex- 
plorer wants to use controls that are “script 
safe” and “initialization safe.” If controls 
are not marked as such, an annoying mes- 
sage will accompany each web page that 
incorporates the control. 

To fix this, you need to make a simple addition to the control’s registry settings. ATL inserts registry settings by having them placed in special RGS files that are automatically added to the project’s resources. The contents of these RGS files are merged into the registry when the control is registered using a setup routine or regsvr32. The ATL wizard creates the basic RGS file, which can be subsequently modified as needed. This makes it very simple to modify the entries placed in the registry by the control during the registration process.

To have the control marked as script 
and initialization safe, you need to mod- 
ify the RGS file, as in Listing Ten. The 
first long numerical key in this example 
identifies oVFW’s unique COM class 
identifier. The other two keys identify 
the COM class identifier for the sup- 
ported categories: script safe and initial- 
ization safe. 

Listing Eleven is the HTML code that lets the control be shown on an Internet Explorer web page; Figure 3 shows the resulting web page.


The Future of VFW 

Microsoft considers VFW a dead dog, 
and has not changed its fundamental 
structure since Windows 3.1. Windows 
NT contains the first 32-bit implementa- 
tion of VFW, which does not seem to 
improve its performance in any signifi- 
cant manner (in fact, preview mode is 
slower). 

Microsoft’s latest offering for this are- 
na is DirectShow, a complex, COM- 
based, multilayered interface that allows 
enhanced access to TV tuners, filters, ca- 
ble and satellite decoders, and more. Di- 
rectShow, which is included with Win- 
dows 98, is a stunning contrast to VFW. 
The requirements to write a minimal driv- 
er are quite complex. Additionally, most 
driver writers are required to write sev- 
eral layers. Microsoft also treats DirectX 
as a moving target, changing the speci- 
fications frequently. Luckily, DirectShow 
is backwards-compatible and will support 
VFW applications as well. 

Most of the fundamental flaws that were found in VFW have been fixed in DirectShow, including access to tuner information. DirectShow provides clear information about window movement and live-video regions. However, due to the complex framework of COM-object layers involved in DirectShow, it will not provide a significant performance improvement over Video for Windows. Without an improvement in performance, preview mode will not be able to support full-motion video, which makes the implementation of things such as real-time filters for live video simply impossible.


DDJ


Listing One

[
    hidden,
    uuid(54526101-F0CA-11d1-969F-002018631632)
]
dispinterface _oVFWEvents
{
    properties:
    methods:
    [id(1)] void DriversEnum([in] long index, [in] IoVFWDriverInfo* driver);
};

Listing Two

coclass oVFW
{
    [default] interface IoVFW;
    [default, source] dispinterface _oVFWEvents;
};


Listing Three

class ATL_NO_VTABLE CoVFW :
    public CComObjectRootEx<CComSingleThreadModel>,
    public CComCoClass<CoVFW, &CLSID_oVFW>,
    public CComControl<CoVFW>,
    public CStockPropImpl<CoVFW, IoVFW, &IID_IoVFW, &LIBID_oVFW>,
    public IProvideClassInfo2Impl<&CLSID_oVFW, &DIID__oVFWEvents, &LIBID_oVFW>,
    public IPersistStreamInitImpl<CoVFW>,
    public IPersistStorageImpl<CoVFW>,
    public IQuickActivateImpl<CoVFW>,
    public IOleControlImpl<CoVFW>,
    public IOleObjectImpl<CoVFW>,
    public IOleInPlaceActiveObjectImpl<CoVFW>,
    public IViewObjectExImpl<CoVFW>,
    public IOleInPlaceObjectWindowlessImpl<CoVFW>,
    public IDataObjectImpl<CoVFW>,
    public ISupportErrorInfo,
    public IConnectionPointContainerImpl<CoVFW>,
    public CProxy_oVFWEvents<CoVFW>,
    public ISpecifyPropertyPagesImpl<CoVFW>
{


Listing Four

EXTERN_C const IID DIID__oVFWEvents =
    { 0x54526101, 0xf0ca, 0x11d1,
    { 0x96, 0x9f, 0x0, 0x20, 0x18,
      0x63, 0x16, 0x32 } };

Listing Five

BEGIN_CONNECTION_POINT_MAP(CoVFW)
    CONNECTION_POINT_ENTRY(DIID__oVFWEvents)
END_CONNECTION_POINT_MAP()

Listing Six

pDriverInfo = new CComObject<CoVFWDriverInfo>;
dynamic_cast<IoVFWDriverInfo*>(pDriverInfo)->AddRef();


Listing Seven

dynamic_cast<IoVFWDriverInfo*>(pDriverInfo)->Release();


Listing Eight

return Error(
    _T("oVFWDriverInfo: bad driver information object"),
    IID_IoVFWDriverInfo, CUSTOM_CTL_SCODE(1009));


Listing Nine 


Set Picture1.Picture = Clipboard.GetData(vbCFDIB) 


Listing Ten

..RGS stuff here...
ForceRemove {5A5FFDB1-F0A7-11D1-969F-002018631632} = s 'oVFW control'
{
    ..additional RGS stuff here...
    ForceRemove 'Implemented Categories'
    {
        ForceRemove '{7DD95801-9882-11CF-9FA9-00AA006C42C4}'
        ForceRemove '{7DD95802-9882-11CF-9FA9-00AA006C42C4}'
    }
}

Listing Eleven

<script language="VBScript">
<!--
Sub window_onload()
    vfw.ShowLiveVideo true
end sub
-->
</script>

<p><object id="vfw" name="vfw"
    classid="clsid:5A5FFDB1-F0A7-11D1-969F-002018631632"
    align="baseline" border="0" width="320" height="240" h="240"
    w="320">Ofer LaOr - VFW</object> </p>


Dummies, Failures,
and the Rest of Us 


Michael Swaine 


While working on the second edition of Fire in the Valley, Paul Freiberger’s and my 1984 book on the making of the personal computer, I realized that Chapter 9 would have to be split into two chapters. That would mean that Chapter 10, the last chapter, would now be Chapter 11. It struck me as a nice structural comment on the subjects of our book. Chapter 11 is bankruptcy, which was indeed the last chapter for many of the early computer and software companies we wrote about.

Failure, for all its drawbacks, is not unenlightening. You can learn a lot from failure, and do it painlessly if it’s somebody else’s failure. You can also learn a lot from novices, often self-described as dummies. It’s epigrammatic that teachers learn through teaching. In helping those who supposedly know less, we often learn that we don’t know what we thought we knew as thoroughly as we thought we knew it. This month’s column features dummies and failures and, I hope, some learning.





Linux for Dummies 

Although I don’t expect DDJ readers to buy 
Dummies books, we do get asked to rec- 
ommend books for users, don’t we? So it’s 
good to know what’s out there. Linux for 
Dummies has been out there for a while, 
but it’s now in its second edition. It is writ- 
ten by Jon “maddog” Hall, executive di- 
rector of Linux International and Compaq’s 
chief Linux guy, and is published by IDG 
Books, of course, the Dummies people 
(copyright 1999; ISBN 0-7645-0421-5). 

It's a book on Linux explicitly for the 
user, not for the system administrator. That 
means that it skips certain topics covered 
as a matter of course in almost every oth- 
er Linux book, such as network adminis- 
tration. 


Michael is editor-at-large for DDJ. He can 
be contacted at mswaine@swaine.com.

On the other hand it is very good on
topics like connecting to the Internet via 
a serial modem and an ISP. Linux for 
Dummies provides what the naive user 
with some Windows experience needs to 
set up and start using Linux in a single- 
user installation with no system adminis- 
trator around. It’s very readable, tells no 
harmful lies, and comes with the public 
parts of the 5.2 Red Hat release. 

But the book does raise the question, how
appropriate is it for this naive user with 
some Windows experience in a single-user 
installation and no system administrator 
around to be using Linux? IDG Books has 
sold a lot of copies of this book, and I sus- 
pect that some of the people who bought 
it had just heard so much about Linux that 
they thought they ought to try it out, much 
as they might decide to try out a new piece 
of application software. After all, they get 
the book and the software for around twen- 
ty bucks, making it, by computer-store- 
shopping standards, in the price range of 
an impulse buy. 

Installing a new operating system should not be done on an impulse. And despite some impressive effort to move Linux mainstream, this is not (not yet, at least) an operating system for everyone. It’s probably not an operating system for the impulsive, naive user I have described here as a likely buyer of the book.

Assuming that Linux even makes sense 
as a mainstream operating system com- 
peting with Windows and MacOS, until it 
has a few more ease-of-use boxes 
checked, it is not smart for proponents to 
push it on people who will be disap- 
pointed and badmouth it. 

And I am pointing the finger at whom, 
exactly? Me and my ilk, I suppose. Not 
OEMs, certainly, who are presenting Lin- 
ux only as an option. Not developers, 
particularly. No, it’s us trend watchers. 
But don’t be too smug; you may be seen 
as a trend watcher within your organi- 
zation or amongst the people who ask 





you to recommend books on technical 
topics. Unless we want to produce a Lin- 
ux backlash, we should maybe be a lit- 
tle more careful about how we beat the 
drum for Linux. That said, I’m about to 
do it again. 


Linux for the Rest of Us 

About Linux itself I have no such reser- 
vations. I now maintain four operating 
systems on the various machines in the 
offices here, not counting version differ- 
ences. Although the relative amount of 
work done on the four varies too widely 
to make comparisons meaningful, I can 
report that Linux is the only OS of the four 
that has never crashed for me. So I had 
no axe to grind when I went to the Linux- 
World Expo, held in San Jose, California, 
in March. 

The first hint that Linux was stirring a 
lot of people’s interest was the full park- 
ing lot on Almaden. I parked around the 
corner on Woz Way, across from the Chil- 
dren’s Discovery Museum, and hiked over 
to the Convention Center. 

There, I got the second hint that things 
were really happening in Linuxville. Step- 
ping through the door of the exhibits hall 
was like walking onto the show floor at 
Comdex. All right, there was some differ- 
ence in scale. But the airspace above the 
crowd was filled with the signage of big 
companies — Compaq, IBM, Sun, Hewlett- 
Packard, Oracle, Sybase, Pick, Computer 
Associates, SCO. 

Third clue: Below the signs it was wall- 
to-wall people. I like to try to read the 
crowd at shows, and this crowd was up- 
beat. My informal crowd temperature 
reading, based as usual on lunchtime 
chats, impromptu interviews, overheard 
conversations, and body language, says 
that the people at LinuxWorld Expo 99 
felt that where they were was Where It 
Was At. 

I’m writing this after the show, and the announcements before, at, and immediately after the show are blurring together for me, but all of the following happened close to the time of the show:

• Intel and Cyrix announced plans to optimize Linux for the Pentium II and III.

• Several big companies invested in Red Hat Software, a Linux company. All the major database-software companies announced plans to support Linux or actually demonstrated software support at the show. The GNOME open-

Paradigms Past: Why Did Babbage Fail? 


Anyone who recognizes the name Charles Babbage knows two things about him: A century and a half ago, he invented the digital computer. And he never succeeded in building it.

He is remembered both as a brilliant inventor and as a failure.

The generally accepted explanation for his failure is that he was limited by the technology of the time. Nineteenth-century machine tools, so the story goes, simply could not produce parts to the tolerances required by Babbage’s designs for his Difference Engine and Analytical Engine.

That explanation, it turns out, is simply not true. This fact was spectacularly demonstrated a few years ago when
Doron Swade led a team of engineers, 
specialists in Victorian-era metallurgy, and 
others, to build one of Babbage’s ma- 
chines to his specifications using only con- 
struction methods and materials available 
to Babbage in his time. Swade and his 
team got the Difference Engine II put to- 
gether and tested in time for the Babbage 
Sesquicentenary Celebration at the Sci- 
ence Museum in London, where Swade 
is Senior Curator of Computing. When 
they turned it on and started doing cal- 
culations with it, they completed a pro- 
cess that proved conclusively that Bab- 
bage could have built his Engines using 
the technology of his time. So, if it wasn’t 
the limitations of 19th-century technolo- 
gy, why did Babbage fail? 

Swade addressed that question re- 
cently in a lecture that I managed to 
catch at Stanford University. The short 
answer is, because of who Babbage was. 

Babbage was a radical who never let 
up challenging authority. Although he 
founded the Analytical Society and was 
later named Lucasian professor of math- 
ematics at Cambridge, he got thrown out 
of Cambridge as a student for writing a 
thesis judged to be blasphemous. The 
church held great sway over higher 
learning at the time, and Babbage open- 
ly challenged its power. But it wasn’t just 
the church he challenged. “Most of his 
writings were written to protest rather 
than to convince,” Swade said. He reg- 
ularly attacked his peers and the aca- 
demic institutions of the day. One ex- 
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ample: In 1850, Babbage published a 
savage attack on the Royal Society at the 
time when it was funding work on his 
engines. “There is a cost to principles,” 
Swade said, “particularly if you’re polit- 
ically inept.” 

Some of the attacks got very personal, 
and Babbage definitely had enemies. He 
earned them, and some had influence 
over those funding his work. There were 
challenges to his appeals for money and 
there were outright vendettas against him. 

Nevertheless, he got funding under 
conditions that seem outrageous today. 
He submitted no budget for the project, 
was subject to no cost ceiling or bench- 
marks for success. Lenders wouldn’t be 
so free with their money today. 

Some sort of constraints might have 
been useful, though. When Babbage had 
spent all the money he was able to get, 
he had only a fraction of the machine parts 
needed to build the first machine. So he 
instructed his chief engineer to assemble 
what he could from them so he would 
have something to show investors, and 
this fragment is the only piece of Bab- 
bage’s engines ever built by his operation. 

That same engineer was involved in his 
own dispute with Babbage, this time over 
money, which got in the way of com- 
pleting the machines. Babbage didn’t 
alienate just those in power; he made en- 
emies of employees and contractors, too. 

Then there was the vision thing: Al- 
though Babbage had his Ada Lovelace 
to explain and popularize his work, and 
although she did a good job, he himself 
was a lousy communicator of his ideas 
(to paraphrase Swade). And Ada, ac- 
cording to Swade, didn’t really under- 
stand Babbage’s work all that well. I 
know people who disagree with him on 
that point, but Swade. contends that 
Ada’s great contribution to the under- 
standing of the potential of the com- 
puter, her generalizing beyond the mere 
calculation that was all Babbage himself 
ever talked about, was due to her mis- 
understanding of Babbage. That, and 
being a romantic. Ada scholars may leap 
to her defense by e-mailing me at 
mswaine@swaine.com. 


— M.S. 


source GUI for Linux was demon- 
strated at the show. So was the Linux 
version of the Opera browser and 
GIMP open-source image-manipulation 
software. 

e Ditto Twine, a set of tools for porting 
Windows apps to Linux. Corel was mak- 
ing a big case for Linux for mainstream 
office use at the show. 

e IBM and CA had major Linux an- 
nouncements. 

¢ The Java2 development kit was avail- 
able for Linux, Sun told us. 

e A cluster of Pentium Is running Linux 
beat a Cray at number-crunching. 

e Japanese companies were coming 
aboard the Linux bandwagon, as was 
the German software giant SAP. 

e And Larry Wall explained why Perl was 
the first postmodern computer lan- 


guage. 


In short, Linux was catching on in a big 
way, the show was a success, and a love- 
ly time was had by all. 


Has HP Lost Its Way? 

Outside a bagelry on the second morning
of the Expo, a headline in a newspaper
box caught my eye and I had to interrupt
my quest for food to dig for
quarters. Over coffee and bagel, I read 
the story of the breakup of Hewlett- 
Packard. 

HP was becoming two companies, not 
to be named Hewlett and Packard but 
rather Hewlett-Packard and Something Yet 
To Be Determined. The former would 
keep all the computer and related (like 
printer) business and the other would get 
everything else. 

Well, all right. It sounded like a logical 
enough division; seemed like it would al- 
low the computer company to focus more 
clearly; might let different cultures devel- 
op in the two companies, reflecting the 
different paces of the personal computer 
and instrumentation markets. HP was 
putting as positive a spin on it as possi- 
ble, of course. But it was clear that this 
move was a serious response to a serious 
problem. HP management sees the com- 
pany as out of touch— not sufficiently re- 
sponsive to changes in the market. The 
personal computer market, that is. How 
seriously the problem is being seen is re- 
flected in the fact that Lewis Platt, the cur- 
rent CEO, won't be staying with either 
company. 

It was with a pensive chomp that I pol- 
ished off the bagel. It’s sad to see one of 
the companies that defined What Silicon 
Valley Is in such distress. HP’s culture of 
respect for the individual set a standard 
that many other Silicon Valley compa- 
nies, perhaps most ostentatiously Apple, 
tried with varying degrees of success to 
emulate. I hope HP doesn’t jettison the
things that made it great. 

Of course, the history of the companies 
that defined What Silicon Valley Is is not 
one of boundless success. Fairchild Semi- 
conductor is a big part of the history of 
the Valley, but it is remembered today 
mostly for the people who quit and went 
on to start their own companies, like In- 
tel. And then there’s Xerox PARC. 


Lightning Didn’t Strike 

The founder of Federal Express got a C 
on the college paper he wrote describ- 
ing the idea for his company. FedEx and 
its competitors are doing pretty well now. 
I wonder, as the UPS truck pulls up out- 
side Stately Swaine Manor, if all those 
stats on the growth of e-commerce in- 
clude the increased business for FedEx 
and UPS from the likes of Amazon.com. 
My latest delivery from Amazon.com is
Dealers of Lightning: Xerox PARC and
the Dawn of the Computer Age, by
Michael Hiltzik (HarperCollins 1999, ISBN
0-88730-891-0).

Although I don’t agree with Hiltzik that
the Alto was the world’s first personal
computer, that’s just a matter of different
definitions: his strictly technological,
mine involving price and marketing as
well. I have a few other quibbles with the
book, but, overall, I found it highly readable
and seemingly authoritative. In writing
the book, Hiltzik drew on the recollections
of those who were there,
interviewing all the obvious suspects and
not a few innocent bystanders.

The book is worth reading just to re- 
mind yourself of the amazing invention 
machine PARC was— and of the amazing 
collection of inventors who were there. 

The development of the Alto, of course, 
but also: 


• Jim Clark (the cofounder of Netscape)
  designing the Geometry Engine as part
  of a PARC-supervised course at Stanford
  and launching Silicon Graphics on the
  strength of it.

• Lynn Conway (of Mead and Conway,
  the most well-known names in VLSI)
  developing the design techniques and
  tools to make VLSI a practical reality.

• An offhand remark to Gary Starkweather
  leading to the invention of the laser
  printer.

• Bob Metcalfe sifting through various
  networking options and coming up with
  Ethernet.

• Alan Kay telling Dan Ingalls and Ted
  Kaehler that the most powerful programming
  language in the world could
  be specified in one page and, when
  challenged to put up or shut up, inventing
  Smalltalk.

• Bob Taylor recruiting Bill English away
  from Doug Engelbart and getting access
  to the mouse and all the other goodies
  of Engelbart’s lab.

• Charles Simonyi inventing WYSIWYG.

• Dan Ingalls shocking the crowd when
  he demonstrated bitblt.

• John Warnock and Chuck Geschke creating
  page-description languages.

• Alvy Ray Smith coming up with the HSV
  transformation.


Hiltzik describes PARC’s origins, the recruitment
of talent, its culture, people, politics,
and projects. He also spends a chapter
on the question, “Did Xerox blow it?”
That strikes me as overkill for a question
that can be answered in a word: Duh!

But I don’t mean to belittle Hiltzik’s analysis
of the politics of PARC. He does an
impressive job of telling not only what
happened, but why and how it happened,
and how Xerox management both hindered
and empowered this amazing band
of inventors.

If this is failure, we should all be so un- 
successful. 


DDJ 
Covering the Basses


Al Stevens 


Many of you know that, along with
programming, music is my pas- 
sion. Acoustic music and acoustic 
jazz music, in particular, have al- 
ways been a large part of my life. I pre- 
fer acoustic over electronic music for many 
reasons, one being that when I was a 
youthful budding jazz piano player, elec- 
tronic music was restricted mostly to am- 
plified guitars. My roots aside, acoustic 
music contains physical elements that 
reach into the souls of most players. We 
resonate to the variations and harmonics 
produced by acoustic instruments when 
played by human players. Such sounds 
come from natural components that strike 
each other and vibrate in response to the 
human touch—wooden reeds, gut and 
steel strings, skin drum heads, felt ham- 
mers, brass bells— and random and ac- 
cidental variations involving lip pucker 
and vibrations, air fluctuations, spit, the 
dynamics of arms, hands and fingers, feet 
pressing mechanical pedals, and so on. 
It's a human thing. 

Computers can only approximate some 
of these things. Computer music might 
someday replace acoustic music made by 
human beings, but that time is far away, 
waiting either for Holodecks to be per- 
fected or for a generation of listeners to 
arrive who are too busy and too unin- 
volved with their ambient surroundings to 
want to relate to them— a generation that 
likes artificial fireplaces, plastic house- 
plants, and disposable appliances. Well, 
maybe that’s not all that far away, but in 
the meantime... 

Electronic music as music does not 
move me, but the technology that pro- 
duces it fascinates me as a programmer. 





Al is a DDJ contributing editor. He can be 
contacted at astevens@ddj.com. 
MIDI, the Musical Instrument Digital Interface,
combines music and technology
to solve certain problems, ones that are 
not usually among the concerns of acous- 
tic musicians. MIDI describes a format of 
data packets that tell electronic instruments 
to play specific notes at specific times dur- 
ing a performance. You can make a lot of 
interesting sounds and music with MIDI, 
but they do not inspire or interest many 
jazz players. 

One area where MIDI really excels, 
though, is its ability to reproduce a hu- 
man player’s acoustic piano performance. 
A piano is a unique percussive and ex- 
pressive acoustic instrument. Its dynam- 
ics are a function of key combinations and 
attack. You can’t bend notes like you can 
on a brass or reed instrument. Whereas a 
horn responds to combinations of lips, 
breath, saliva, and tongue, a piano knows 
only three things—what key did you 
press, how hard did you hit it, and how 
long did you hold the key down. (One 
additional variation occurs when the sus- 
tain pedal is depressed such that the oth- 
er strings vibrate with sympathetic har- 
monics to the strings being struck.) 
Contemporary MIDI playback devices can 
approximate most of this with such ac- 
curacy that many listeners cannot tell 
whether a song is being played by a live 
pianist at a real piano or by a sequencer 
and tone generator that produces the sam- 
pled sounds of a real piano playing a MIDI 
sequence as recorded by a live player. You 
don’t get that kind of sound from the gar- 
den variety SoundBlaster, but high-end 
professional sound cards typically include 
very good acoustic piano samples. 

I am a saloon piano player, and I play 
acoustic pianos whenever possible. MIDI 
for me is a medium for acquiring or build- 
ing tools that assist practice. (An earlier 





column project, MidiFitz, was one such 
tool.) I can record a piano performance, 
and the sequencer lays down each track 
as a series of event data packets rather 
than as a stream of audio waveform sam- 
ples. When I play the sequence back, it 
sounds like me, mistakes and all. If I 
missed a note or messed up a passage, I 
can repair the problem by using the edit- 
ing features of a sequencer program. I can 
also play the hard passages slowly enough 
to get through the piece and then increase 
the tempo for playback. Sure, those tricks 
are cheating, but they’re better than 
recording a difficult song over and over 
until someday I get it right. 

Years ago I played the trumpet, trom- 
bone, and double bass in addition to the 
piano. I retired those instruments in the 
1980s mainly to concentrate on the piano 
but also because I rarely found pianists 
who played what I wanted to hear. Re- 
cently, I have taken up the double bass 
again, but because time and neglect erode 
ability, calluses turn soft, and muscle mem- 
ory fades, I need to woodshed (practice) 
to get back into shape and build up my 
chops. The string bass as a jazz instrument 
is best practiced in the context in which 
it is played—with a group or at least with 
a piano player. I need a piano player 
whose style and harmonic conceptions 
match mine. Not wanting to find and hire 
someone, and being such a piano player 
myself, I decided to record a series of 
MIDI piano renditions so that I can play 
with myself (musically, of course). If I 
don’t like the piano player it’s my own 
fault. These sequences have become my 
primary bass practice regimen. MIDI plays 
the piano and I plunk along on the bass, 
getting better, I hope, with each session. 
Perhaps later I'll add audio bass tracks, a 
few comping choruses, and use the 







recordings to practice playing the horns. 
Then, who knows, perhaps an album of 
music on which I play all the parts. The 
Al Stevens Quartet. Grammy, here I come. 

Why not use audio tapes, you might 
ask? I want the flexibility of fast song se- 
lection and key signature and tempo chang- 
ing that tape does not provide. How about 
WAV files? Still no tempo or key changing 
(without long processing times) and too 
much hard disk space. I have about 300 
songs recorded with which I practice and 
as many more on the list to record. MIDI 
packets are more economical than WAV 
files, and they allow me to change all kinds 
of things without rerecording the piece. 
Wonderful stuff, this MIDI. 


MIDI Software 

Now, about the software. (Good, you say, 
you were wondering when this “C Pro- 
gramming” column was going to get 
around to discussing software.) An ad- 
vantage of being a programmer is that 
when available software will not do, you 
can write your own. There is a lot of MIDI 
software, but I never found a program that 
does what I need. Sequencer programs 
load and play one file at a time. A dou- 
ble bass is an awkward instrument. The 
computer controls are too far away when 
you are standing with your arm wrapped 
around that big bass fiddle, so even one- 
handed computer operation is inconve- 
nient and error-prone. You must put the 
bass aside and sit at the computer to use 
the mouse and keyboard to load the next 
file. Takes too much time. Clearly I need 
a jukebox program to play a set of songs 
from a list. 

MIDI jukebox programs abound, but 
most of them have three shortcomings. 
First, they go immediately from one song 
to another, not giving me time to look at 
what’s coming and rest my weary hands. 
Second, they have no count-off. A bass 
player needs to know how fast the piano 
player is going to play the song. On the 
bandstand, someone usually says, “One, 
two, three, four.” Third, they have no 
metronome. Half of a bass player's job is 
keeping time. (The other half is playing 
correct notes, of course.) A metronome is 
a good training device for learning to keep 
accurate time. 


Jukebox 

To solve those problems, I developed 
Jukebox, the project for this month. Juke- 
box maintains a list of Standard MIDI For- 
mat (SMF) files in a dialog-based MFC ap- 
plication. As you select a song from the 
list of titles, the dialog displays the song’s 
tempo and key and time signatures. The 
program lets you organize the list, modi- 
fy the tempo for each song, specify a num- 
ber of seconds to wait between songs, 
toggle a metronome, and say how many
measures of count-off to play at the be- 
ginning of each song. You can start play- 
ing anywhere in the list. Jukebox re- 
members the last song you played in each 
session, selecting the next song for the 
next session. Figure 1 is the Jukebox ap- 
plication dialog. The right pointing arrow 
icon is on the Play button. The up and 







down arrow buttons let you move the se- 
lected song up and down in the list. You 
can download the Jukebox source code; 
see “Resource Center,” page 5. I'll discuss 
the more interesting aspects of the pro- 
gram here. 


MIDIFile, MIDIInfo, and MIDIPlayer 

In May of last year I wrote about the MIDI- 
File class library, which supports reading 
and writing SMF files. To read SMF files, 
a program derives a class from MIDIFile 
and overrides the member functions that 
process the MIDI events the program 
wants to process. 

I modified MIDIFile for Jukebox to sup- 
port a feature it did not have. Jukebox 
builds its list of files by reading the first 
track from the SMF file where it finds the 
MIDI events that describe the song title, 
tempo, and key signature. MIDIFile scans 
an entire SMF file looking for selected 
events in all the tracks. It had no mecha- 
nism to tell it to interrupt the scan and 
close the file. Jukebox took too long to
build the list of song titles because MIDIFile
scans the entire file for each song, yet
the information for the list is at the beginning
of each file in track 1. I added a
StopReading member function to the MIDIFile
class. When Jukebox’s derived MIDIInfo
class sees a start track event that is
not for track 1, it calls StopReading, which
stops the SMF file scan.
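
The early-exit idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the MIDIFile source: the names Event, EventScanner, and TitleOnlyScanner are hypothetical, and the sketch walks an in-memory list of events instead of parsing a real SMF file.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for an already-parsed SMF event.
struct Event {
    enum Kind { StartTrack, Title, Note } kind;
    int track;          // track number the event belongs to (1-based)
    std::string text;   // title text, used only by Title events
};

// Hypothetical base class: walks the events and calls virtual handlers.
class EventScanner {
    bool stop_ = false;
public:
    virtual ~EventScanner() = default;
    // Returns the number of events examined before the scan ended.
    int Scan(const std::vector<Event>& events) {
        stop_ = false;
        int seen = 0;
        for (const Event& ev : events) {
            if (stop_) break;   // a handler asked to quit early
            ++seen;
            switch (ev.kind) {
            case Event::StartTrack: OnStartTrack(ev.track); break;
            case Event::Title:      OnTitle(ev.text);       break;
            case Event::Note:       OnNote();               break;
            }
        }
        return seen;
    }
protected:
    void StopReading() { stop_ = true; }   // request an early end to the scan
    virtual void OnStartTrack(int) {}
    virtual void OnTitle(const std::string&) {}
    virtual void OnNote() {}
};

// Gathers the title from track 1 and stops at the first later track.
class TitleOnlyScanner : public EventScanner {
public:
    std::string title;
protected:
    void OnStartTrack(int trackno) override {
        if (trackno != 1) StopReading();
    }
    void OnTitle(const std::string& t) override { title = t; }
};
```

With a five-event song whose second track starts at the third event, a TitleOnlyScanner examines three events and ignores the rest, which is the saving that matters when hundreds of files are scanned to build a list.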

Time out. Isn’t a pure object-oriented 
programmer supposed to use inheritance 


to change the behavior of an existing class? 
I suppose so, but this change adds a fea- 
ture to an existing class, and the addition 
of the feature has no impact on programs 
that are compiled with the previous ver- 
sion. You’d have to recompile them if you 
wanted to use a common header file for 
all applications, old and new, but that’s all. 

To read an SMF file for playback, Jukebox
derives the MIDIPlayer class from MIDIFile
and intercepts the real-time events that
control playback of the sequence.

Time out again. Isn’t inheritance sup- 
posed to reflect an IS-A relationship be- 
tween derived and base classes? MIDIFile 
is an abstract bass class that reads SMF 
files. MIDIInfo is a class that gathers in- 
formation about the song in an SMF file. 
MIDIPlayer is a class that plays back the
contents of an SMF file through the com- 
puter’s MIDI system. Can the two derived 
classes really be considered specializations 
of the base class? Is this really an IS-A re- 
lationship? Probably not, but MIDIFile is 
designed so that the derived class over- 
rides functions to intercept MIDI events 
in the file. MIDIFile’s purpose is to be the
file reading and event parsing engine that 
a derived class uses to select only those 
events it wants to process. It hides the de- 
tails of those operations from its descen- 
dent classes. The inheritance mechanism 
is particularly good for expressing this 
kind of abstraction because overridden 
virtual functions in the derived class in- 
tercept the events the application cares 
about and the absence of overridden func- 
tions bypasses events that the application 
does not care about. MIDIInfo is unconcerned
about the real-time events. It needs
to get the song title, tempo, and time and 
key signatures to display them in the ap- 
plication dialog window. MIDIPlayer is 
concerned mainly about playback events 
(although the tempo event is one concern 
the two derived classes share). 

The lesson learned here is that purist 
programming dogma— even the object- 
oriented agenda— is not always the only 
solution and does not always deliver the 
best implementation. 


The Sequencer 

MIDIPlayer is a miniature sequencer, 
which is a program that plays sequences 
through a MIDI system. Listings One and 
Two are midiplayer.h and midiplayer.cpp, 
the source code files that implement the 
sequencer. MIDIPlayer loads the real-time 
playback events into one std::vector per 
track and plays the events back. The 
events that Jukebox needs are Note On, 
Note Off, Controller Change, and Program 
Change. The functions of the first two 
event types are obvious from their names. 
Controller Change tells the device associ- 
ated with the channel to change some 
kind of controller. On a piano, this controller
is usually the sustain pedal. Program
Change tells the device to use a different
“patch,” which is General MIDI
jargon for the instrument sound selected
from among 128 different instruments.
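
Each of these four events travels as a short message of one status byte and up to two data bytes. The sketch below shows one way to pack such a message into a single 32-bit word, with the status and channel in the low byte, in the spirit of StoreEvent in Listing Two. The function name PackShortMessage and the constant names are hypothetical; the status values are the standard MIDI 1.0 ones.

```cpp
#include <cassert>
#include <cstdint>

// Status bytes (high nibble) for the four channel messages discussed above.
const std::uint32_t kNoteOff       = 0x80;
const std::uint32_t kNoteOn        = 0x90;
const std::uint32_t kControlChange = 0xB0;
const std::uint32_t kProgramChange = 0xC0;

// Pack a channel message into one 32-bit word:
//   bits 0-7   status nibble combined with the channel (0-15)
//   bits 8-15  first data byte (note, controller, or program number)
//   bits 16-23 second data byte (velocity or controller value)
inline std::uint32_t PackShortMessage(std::uint32_t status,
                                      std::uint32_t channel,
                                      std::uint32_t data1,
                                      std::uint32_t data2)
{
    return (status & 0xF0) | (channel & 0x0F) |
           ((data1 & 0x7F) << 8) | ((data2 & 0x7F) << 16);
}
```

Middle C (note 60) struck at velocity 100 on the first channel packs to 0x00643C90, and pressing the sustain pedal (controller 64, value 127) packs to 0x007F40B0. Program Change carries only one data byte, so its second data byte is passed as zero.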

An SMF file does not store track events 
together in one merged stream. Instead, 
the file contains all the events for one 
track, followed by all the events for the 
next track, and so on. Each event has a
delta time stamp specifying how much
time to wait during playback since the
previous event for that track before activating
the current event. MIDIPlayer collects
all those events into memory vectors
when the program calls
MIDIFile::ReadMIDIFile(). When the program calls
MIDIPlayer::Play(), the function sets a
real-time timer that ticks once every millisecond.
The timer calls the
MIDIPlayer::TimingMessage function. This is where
the sequencing is actually performed.

At each tick of the timer, the program 
checks each track vector to see if an event 
in the vector is due to be activated. An- 
other vector stores offsets into the track 
vectors to indicate which is the next un- 
activated event. 
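
The per-tick check can be sketched as follows. This is a simplified illustration, not the TimingMessage code: TimedEvent and DueEvents are hypothetical names, the event payload is reduced to an integer id, and each event is assumed to carry a time accumulated from the start of the sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct TimedEvent {
    long time;   // accumulated time, in ticks from the start of the sequence
    int  id;     // stand-in for the MIDI packet that would be sent
};

// Collect every event, across all tracks, that is due at or before 'now'.
// next[t] is the offset of the first unactivated event in tracks[t]; it is
// advanced past each event that is returned, so no event fires twice.
std::vector<int> DueEvents(const std::vector<std::vector<TimedEvent> >& tracks,
                           std::vector<std::size_t>& next, long now)
{
    std::vector<int> due;
    for (std::size_t t = 0; t < tracks.size(); ++t) {
        while (next[t] < tracks[t].size() &&
               tracks[t][next[t]].time <= now) {
            due.push_back(tracks[t][next[t]].id);
            ++next[t];
        }
    }
    return due;
}
```

Because each track is already in time order, only the event at the stored offset needs to be examined, and a tick on which nothing is due costs one comparison per track.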

The granularity of the timer presents 
an unusual problem. Win32’s real-time 
timer has a resolution of no better than 
one tick per millisecond. Yet the delta 
time ticks in some SMF files can specify 
a much higher resolution. Irrespective of 
the tempo of a song or the 32nd note 
resolution of an arrangement, the player 
can press and release notes at any time 
at all. It’s called interpretation. A program 
has to process these events such that the 
tempo of the song and the proximity of 
the notes played is as close to the origi- 
nal rendition as possible. To understand 
this problem, download and play a se- 
quence of something like “Rhapsody In 
Blue,” wherein the MIDI programmer en- 
tered notes exactly as Gershwin (and 
Grofe) wrote them down. The sequence 
sounds wooden and mechanical because 
every 16th note is exactly a 16th note and 
so on. Now listen to a sequence of the 
same composition wherein a performer 
played the composition on a MIDI keyboard
into a sequencer that recorded the
performance in real time. If the instrument
tone generation is any good, the
playback sounds real.

Figure 1: Jukebox application dialog.

How do you fire events that might 
come at you with delta times that repre- 
sent a finer resolution than the timer can 
handle? You can’t, but you can get a close 
approximation by compromising accura- 
cy. For the solution I went to Maximum 
MIDI, by Paul Messick (Manning Publi- 
cations, 1998, ISBN 1-884777-44-9), where- 
in Paul explains how to use two integers 
to simulate floating-point precision for 
the conversion of the tempo and delta 
time into a number that determines 
whether it’s time to activate an event. This 
book is a valuable resource for anyone 
who wants to write MIDI software that 





runs under Windows. It will save you a 
lot of research and experimentation. 
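
The two-integer idea can be illustrated with a small accumulator. This is a sketch of the general technique under stated assumptions, not Messick's code and not the exact arithmetic of Listing Two; TickClock and its members are hypothetical names. One integer holds the whole ticks elapsed and the other carries the remainder of the division forward, so fractions of a tick are never lost.

```cpp
#include <cassert>

struct TickClock {
    long division;    // delta-time ticks per quarter note (SMF header)
    long tempo;       // microseconds per quarter note
    long clock;       // whole SMF ticks elapsed so far
    long remainder;   // leftover numerator, always in [0, tempo)

    TickClock(long div, long tmpo)
        : division(div), tempo(tmpo), clock(0), remainder(0) {}

    // Advance by one timer period of periodMs milliseconds and return the
    // number of whole ticks that elapsed during this period.
    long Advance(long periodMs) {
        long numerator = remainder + periodMs * 1000L * division;
        long whole = numerator / tempo;          // integral part
        remainder = numerator - whole * tempo;   // fractional part, carried
        clock += whole;
        return whole;
    }
};
```

At 480 ticks per quarter note and 500,000 microseconds per quarter note (120 beats per minute), each millisecond is worth 0.96 ticks. Plain integer division would yield zero on every period and the clock would never move. With the carried remainder, the first period yields 0 ticks, the second yields 1, and after 1000 periods the clock stands at exactly 960.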


MIDI Mapper 

Most sequencer programs include dialogs 
that let you specify which MIDI channels 
go to which MIDI devices. A real-time MIDI 
event is directed to a channel (not to be 
confused with a track, the two of which are 
often confused by MIDI novitiates). Com- 
puters can have more than one MIDI out- 
put device. Every contemporary Sound- 
Blaster and most other sound cards include 
an internal synthesizer and an external MIDI 
OUT jack. You might want to play the 
drums on your sound card and the piano 
through the samples on your electronic key- 
board. Jukebox does not include such a se- 
lection because Windows 95/98 already 
have one built in. The Control Panel’s Multimedia
applet allows you to specify custom
configurations of channel assignments.
An application can direct its real-time MIDI 
event output to the MIDI Mapper device, 
which uses these assignments. 


The SMF File List 

Jukebox maintains a list of songs that it 
will play. It maintains that list in the sys- 
tem Registry. I thought about using a 
database, considered the additional code 


it would take, and decided not to do it. 
My practice regimen uses the same set of 
songs mostly and Jukebox now works 
well for me. I am now happily plunking
and bowing away whilst my favorite pi- 
ano player tinkles. If you use Jukebox 
and need more than one persistent list 
of tunes, I suggest that you add Key val- 
ues to the Registry to represent them. But, 
whatever you do, please don’t send me 
e-mail chastising me because my pro- 
gram is deficient. I add this request be- 


cause that happens a lot. Many readers 
of my books have written, “Dear Al: Why 
didn’t you write your programs the way 
I would have written them to make them 
more usable to the vast majority of users?” 
To them I say: “You are programmers. 
The source code is yours. Modify it.” Of 
course, readers of this column never 
make such comments, I am happy to say. 
Doorknob. 


DDJ 





Listing One 

// ----- midiplayer.h
#ifndef MIDIPLAYER_H
#define MIDIPLAYER_H
#include "stdafx.h"
#include "midiinfo.h"

// ---- realtime midi event data
struct MIDIEvent {
    Long delta;
    Short eventno;
    Short channel;
    Short param1;
    Short param2;
};
// ---- midi event data ready for sequencing
struct MIDIData {
    Long delta;     // delta time from beginning of sequence
    DWORD data;     // midi event packet
    // --- these are to let the type be contained in a std::vector
    bool operator<(const MIDIData&) const
        { return true; }
    bool operator==(const MIDIData&) const
        { return true; }
};
// ------- class for sequencing a Standard MIDI Format file
class MIDIPlayer : public MIDIFile {
    long division;      // delta time ticks per quarter note
    CWnd* owner;        // window to notify when sequence is done
    // ---- the ticking clock variables
    long clock, period, nticks, fticks, trtime;
    Long delta;         // running delta time accumulation
    std::vector<MIDIData> track;    // track vector of events being gathered
    // --- vector of tracks
    std::vector<std::vector<MIDIData> > tracks;
    // --- vector of track event offsets
    std::vector<int> trkndx;
    HMIDIOUT hMidiOut;
    UINT timer;
    TIMECAPS tc;
    long tempo;         // microseconds per quarter note
    bool ticking;       // semaphore to wait for timing message function to complete
    int countoff;       // number of measures to count off
    bool metronome;     // true for metronome during playback
    int divetr;         // counts for metronome ticks
    int beatspermeasure; // number of beats per measure (3, 4, ...)
    // --- overridden MIDIFile class functions
    void Header(Short fmt,Short trks,Short div)
        { division = div; }
    void StartTrack(int trkno)
        { delta = 0; }
    void EndOfTrack(Long delta);
    void TimeSignature(Long delta,Short numer,
                       Short denom,Short clocks,Short qnotes);
    void NoteOn(Long delta,Short channel,Short note, Short velocity);
    void NoteOff(Long delta,Short channel,Short note, Short velocity);
    void Controller(Long delta,Short channel,Short controller, Short value);
    void ProgramChange(Long delta,Short channel,Short program);
    // ----- private member functions
    void StoreEvent(const MIDIEvent& mev);
    void KillTimer();
    void StopMIDI();
    // ----- timer mechanism
    friend void CALLBACK TimerCallback(UINT, UINT, DWORD, DWORD, DWORD);
    void TimingMessage();
    static MIDIPlayer* pplay;
        // = "this" so TimerCallback can call TimingMessage
public:
    explicit MIDIPlayer(std::ifstream& rFile);
    // ---- play SMF file with count measures of count-off, tempo of tmpo, and
    //      metronome click if metr (if tmpo == 0, use tempo from SMF file)
    void Play(long tmpo, int count, bool metr);
    void StopPlay();    // stop playing the sequence
    void Reset();       // reset the midi system
    void RegisterWindow(CWnd* wnd)  // register a window to notify when sequence is done
        { owner = wnd; }
    void Metronome(bool onoff)      // turn the metronome on or off
        { metronome = onoff; }
    void ChangeTempo(long t)
        { tempo = t; }
};

#endif

Listing Two 

// ---- midiplayer.cpp 


#include "stdafx.h" 
#include "midiplayer.h" 


MIDIPlayer: :MIDIPlayer(std::ifstream& rFile) 
{ 


: MIDIFile (rFile) 


hMidiOut = @; 
timer = Q; 
division = 5; 
owner = @; 
ticking = false; 
countoff = @; 
metronome = false; 
beatspermeasure = 4; 
} 
void MIDIPlayer: :EndOfTrack (Long) 
t 
if (track.size()) { 
// a track of realtime events has been accumulated, save it 
tracks. push_back(track) ; 
track.clear(); 
} 
} 
void MIDIPlayer::TimeSignature(Long delta,Short numer, 
Short denom,Short clocks,Short qnotes) 


{ 
beatspermeasure = numer; 
} 
inline void MIDIPlayer::StoreEvent(const MIDIEventé mev) 
{ 
delta t= mev.delta; 
DWORD dwEvent = mev.eventno | mev.channel | 
(mev.paraml << 8) | (mev.param2 << 16); 
MIDIData data = { delta, dwEvent }; 
track. push_back (data) ; 
} 


void MIDIPlayer::NoteOn(Long delta, Short channel, Short note, Short velocity)
{
    MIDIEvent mev = { delta, MIDI_NOTEON, channel, note, velocity };
    StoreEvent(mev);
}
void MIDIPlayer::NoteOff(Long delta, Short channel, Short note, Short velocity)
{
    MIDIEvent mev = { delta, MIDI_NOTEOFF, channel, note, velocity };
    StoreEvent(mev);
}
void MIDIPlayer::Controller(Long delta, Short channel, Short controller,
                            Short value)
{
    MIDIEvent mev = { delta, MIDI_CONTROL, channel, controller, value };
    StoreEvent(mev);
}
void MIDIPlayer::ProgramChange(Long delta, Short channel, Short program)
{
    MIDIEvent mev = { delta, MIDI_PROGRAM, channel, program, 0 };
    StoreEvent(mev);
}
MIDIPlayer* MIDIPlayer::pplay;
void CALLBACK TimerCallback(UINT, UINT, DWORD, DWORD, DWORD)
{
    MIDIPlayer::pplay->TimingMessage();
}
void MIDIPlayer::Play(long tmpo, int count, bool metr)
{
    if (midiOutOpen(&hMidiOut, MIDIMAPPER, 0, 0L, 0L) == 0) {
        nticks = fticks = 0;    // integer representation of
                                // integral and fractional parts of clock
        period = 1;             // time slice in milliseconds
        clock = 0;              // accumulated time
        trtime = (period * 1000) * division;
        countoff = count * beatspermeasure + 1;
        metronome = metr;
        if (tmpo)
            ChangeTempo(tmpo);  // playing at a specified tempo
        divetr = 0;
        for (int i = 0; i < tracks.size(); i++)
            trkndx.push_back(0);
        pplay = this;
        ticking = false;
        timeGetDevCaps(&tc, sizeof tc);
        timeBeginPeriod(tc.wPeriodMin);
        timer = timeSetEvent(period, tc.wPeriodMin,
                             TimerCallback, 0, TIME_PERIODIC);
        DWORD mmsg = 0xfa;      // start message
        midiOutShortMsg(hMidiOut, mmsg);
    }
}


void MIDIPlayer::TimingMessage()
{
    if (hMidiOut) {
        ticking = true;
        // --- integral part of tick
        nticks = (fticks + trtime) / tempo;
        // --- fractional part of tick
        fticks += trtime - (nticks * tempo);
        // ---- process the count-off and the metronome
        if (divetr <= 0) {
            // --- at a quarter note beat
            if (countoff) {
                // --- in the count-off
                if (--countoff) {
                    DWORD ev = MIDI_NOTEON | 9 | (37 << 8) | (80 << 16);
                    midiOutShortMsg(hMidiOut, ev);
                }
            }
            if (countoff == 0 && metronome) {
                // ---- play metronome (except during count-off)
                DWORD ev = MIDI_NOTEON | 9 | (37 << 8) | (80 << 16);
                midiOutShortMsg(hMidiOut, ev);
            }
            divetr = division;
        }
        divetr -= nticks;
        if (countoff == 0) {
            // ---- sequencer code
            bool stillplaying = false;
            // --- scan the tracks for realtime midi events due for playing
            for (int i = 0; i < tracks.size(); i++) {
                // --- see if there are more events this track
                if (trkndx[i] < tracks[i].size()) {
                    stillplaying = true;
                    // a pointer rather than a reference, so that stepping to
                    // the next event does not overwrite the stored one
                    MIDIData* ev = &tracks[i][trkndx[i]];
                    while (ev->delta <= clock) {
                        // fire this event
                        midiOutShortMsg(hMidiOut, ev->data);
                        trkndx[i]++;
                        if (trkndx[i] == tracks[i].size())
                            break;
                        ev = &tracks[i][trkndx[i]];
                    }
                }
            }
            if (!stillplaying)
                StopPlay();
            clock += nticks;
        }
        ticking = false;
    }
}


void MIDIPlayer::KillTimer()
{
    if (timer) {
        timeKillEvent(timer);
        timer = 0;
        timeEndPeriod(tc.wPeriodMin);
    }
}
void MIDIPlayer::StopMIDI()
{
    if (hMidiOut) {
        DWORD mmsg = 0xfc;  // stop message
        midiOutShortMsg(hMidiOut, mmsg);
        midiOutClose(hMidiOut);
        hMidiOut = 0;
    }
}
void MIDIPlayer::StopPlay()
{
    KillTimer();
    StopMIDI();
    if (owner)
        owner->SendMessage(MM_MCINOTIFY, 0, 0);
}


void MIDIPlayer::Reset()
{
    KillTimer();
    while (ticking)     // wait for TimingMessage to return
        ;
    if (hMidiOut) {
        // --- all notes off, all channels
        DWORD ev;
        for (unsigned char channel = 0; channel < 16; channel++) {
            ev = 0x7bb0 | channel;
            midiOutShortMsg(hMidiOut, ev);
        }
        ev = 0xff;      // system reset message
        midiOutShortMsg(hMidiOut, ev);
        StopMIDI();
    }
}

DDJ 
Can You Share JavaBeans?

James Begole, Philip L. Isenhour, and Clifford A. Shaffer

A JavaBean is simply a component that conforms to the bean
specification (http://splash.javasoft.com/beans/docs/beans.101.pdf).
Beans communicate with each other through the use
of properties, which are “named at- 
tribute(s) associated with a bean that can 
be read or written by calling appropriate 
methods on the bean.” Typically, a bean 
exposes its state through its public prop- 
erties. A property is declared by matched 
get and set methods. For example, an ob- 
ject can expose a property named back- 
ground by simply providing methods 
named getBackground and setBackground 
with appropriately typed return values and 
parameters. Using Java’s Reflection pack- 
age, JavaBeans allow dynamic querying 
and setting of properties at run time. This 
means that users (or another bean) can 
discover and modify a bean’s properties, 
which may in turn affect the appearance 
or behavior of the bean. The JavaBeans 
specification encourages developers to not 
only expose attributes as properties, but 
also to provide notification of property 
changes to registered listeners. When a 
property changes, a bean can generate a 
PropertyChangeEvent describing the 
change, and deliver it to registered listen- 
ers. A property that fires such an event is 
called a “bound property,” because its val- 
ue can be bound to a property of anoth- 
er object. By exposing the application state 
as bound properties, JavaBeans can easi- 
ly become collaborative. 


Collaborating Beans 

With minimal run-time support and some 
discipline on the part of the bean pro- 
grammer, most JavaBeans can be made 
collaborative. Our approach to collabora- 
tion is based on a replicated architecture, 
where each collaborator maintains a copy 
of the shared data. The JavaBeans mech- 
anisms described in the previous section 
give us the necessary tools to replicate 
component state at the granularity of 
properties. Because the JavaBeans framework can dynamically query
and set properties, you no longer need to encode state changes in a
message protocol to be explicitly relayed to replicas. Instead, the
framework can automatically extract the necessary information from a
bound property’s change notification and construct a message for
distribution. Using this approach, we have created a collaborative
JavaBeans environment called “Sieve,” shown in Figure 1 (available at
http://simon.cs.vt.edu/sieve/ and from DDJ, see “Resource Center,”
page 5). Sieve listens for property changes from each bean on the
workspace and, when a PropertyChangeEvent is generated, Sieve
packages it in a message and sends it to the corresponding replicas.
When such a message is received, the local replica of the changed
component is found, and the changed property is set to the new value.

Latecomers are brought up to date by replaying the record of property
changes. Replay time is minimized by keeping only the most recent
property changes. We prefer this to the alternative of sending a copy
of a shared object using Java Object Serialization (JOS), because JOS
requires specific development effort and pausing for serialization
would disrupt a collaborator’s work.

It is possible for two or more collabora- 
tors to generate potentially conflicting prop- 
erty changes. To resolve this, we use the 
strategy described by A. Karsenty and M. 
Beaudouin-Lafon in “An Algorithm for Dis- 
tributed Groupware Applications” (Pro- 
ceedings of the 13th International Confer- 
ence on Distributed Computing Systems, 
IEEE Computer Society Press, 1993), where 
potentially conflicting operations mask, com- 
mute, or are order specific. Typically, prop- 
erty changes are maskable. That is, when 
two operations occur in sequence, the re- 
sult is the same as when only the last 
change is applied. Examples of typical 
maskable properties are color, length, po- 
sition, and size. Maskable properties are 
handled by ensuring that only the last prop- 
erty change is applied to each replica. 

In some cases, property changes are not 
maskable. They may be commutative, 


Sieve: A Collaborative Interactive Modular Visualization Environment


Figure 1: One collaborator’s view (Cliff's) of the HotJava browser bean shared 
via Sieve in the workspace window on the right. Each collaborator’s independent 
view is indicated by a uniquely colored and labeled rectangle in the upper-left 
“Radar View” window, which lets collaborators know each other’s locations. In 
this example, the users named Bo and Cliff are viewing the browser bean in the 
upper-left part of the workspace, while Isenhour is viewing other beans in the 
lower-middle area. Properties such as “Document String” may be edited directly 
on the browser bean or in the lower-left “Properties” window. 
where the result of applying a series of operations is the same
regardless of the order; or order specific, where the final state
depends on the order. For both cases, component developers must be
aware that the component will be shared. Instead of firing a simple
PropertyChangeEvent, the component must generate a specific type of
event that is recognized by the Sieve environment, and must handle
potentially conflicting events. This can be trivially implemented by
not applying the change locally until the corresponding message returns.
More efficient algorithms are known, such 
as the distributed operational transforma- 
tion algorithms described by C. Sun and 
CS. Ellis in “Operational Transformation in 
Real-Time Group Editors: Issues, Algorithms, 
and Achievements” (Proceedings of the ACM 
Conference on Computer-Supported Coop- 
erative Work, ACM Press 1998). 

Many useful collaboration-unaware components may be quickly developed
and shared with our approach. For example, we implemented a
collaborative wrapper for the HotJava Bean component that consists of
less than 100 lines of code. Much of that code is to create the
interface and combine the subcomponents of the HotJava browser.
Example 1 shows the code that is needed to share this browser bean.
Thus, little work is required to create a collaborative version of a
powerful web browser. For some components, however, property changes
may not provide sufficiently fine granularity. Consider a component
that implements a simple text editor. One approach to sharing it would
be to expose the entire text content as a property. Any keystroke that
changed the content would then cause a property change, and the entire
content would be propagated to all replicas. This approach might be
sufficient for small chunks of text, but would not be desirable once
the content grew to more than a few words. A better solution is to
send only a description of each change. JavaBeans encourages
components to generate an event as notification of any such change.
This event can be distributed using the same mechanism as a
PropertyChangeEvent, and if it contains sufficient information, a
collaboration-aware wrapper can merge the change into each replica’s
content. Component-specific collaboration support code must be written
to handle the event, but the component itself remains collaboration
unaware.

Example 1: The code needed to share the browser bean.


Conclusion

When a component's state can be effi- 
ciently defined by a finite set of bound 
properties, it can be shared with no extra 
development effort. A collaborative ver- 
sion of the HotJava browser illustrated that 
significant software components may be 
shared with little implementation effort. 
For those cases where collaboration-aware 
components are desired, mechanisms can 
be developed explicitly to create efficient 
collaborative components. 

Available electronically is an HTML 
browser (see Browser.java) that combines 
the HotJava HTML components (available 
separately at http://www.sun.com/software/ 
htmlcomponent/). Also available electron- 
ically you will find: TextBean.java, a sim- 
ple text input bean; BrowserApp.java, a sim- 
ple application that runs the browser; 
make-it.bat, a DOS batch file to compile 
the above java files; and runit.bat, a DOS 
batch file to run BrowserApp. 
Efficiently Sorting Linked Lists

Many sorting algorithms require you to go through a fixed set of
operations, even if the input data is already mostly sorted. I wanted
to develop an algorithm for sorting, using linked lists, that
capitalizes on already ordered subsequences. In particular, a list
that’s nearly ordered should sort very quickly.

I also wanted a routine that would work well even when the data was in
the worst possible order. Quicksort has been used for years, and has
been considered to be the best overall sort for randomly distributed
data. However, quicksort is slow, approaching O(N²), when the data is
ordered (either forward or backward). In addition, the algorithm is
defined and implemented recursively, which results in more overhead
and, hence, slower run times.

My starting point for the development 
of this algorithm was the merge sort, de- 
scribed in many algorithm books. My al- 
gorithm is similar to Donald Knuth’s “list 
merge sort,” except that Knuth uses an ex- 
tra bit to mark the ends of sublists, where- 
as I look for patterns in the data itself to 
identify sublists. 





Bill is a professor of computing science 
at the University of Central Oklahoma. 
Sorting by Sections

Many lists to be sorted already have sections that are in the correct
order. Suppose you took a list with R sorted sections and split it
into separate sorted lists, as in Figure 1.

The basic idea behind my algorithm is to merge successive pairs of
lists. Each pass of such merges requires O(N) time, since it requires
examining every item. Each merge pass halves the number of lists, so
you make a total of log(R) passes. Note that R, the number of sorted
sublists, is never greater than N, so the total time is never more
than O(N log N). And, if the original list was mostly sorted, R will
be very small and the algorithm will complete quickly.
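
To make R concrete, the following C sketch counts the sorted sections in a singly linked list of integers. The IntNode type and the build_list and count_runs helpers are illustrative names assumed here, not part of the article's listings, and a section is taken to be a maximal non-decreasing run.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type for illustration: next pointer first, then a key. */
typedef struct IntNode {
    struct IntNode *next;
    int key;
} IntNode;

/* Link a caller-supplied array of n nodes into a list holding keys[0..n-1]. */
IntNode *build_list(IntNode *nodes, const int *keys, size_t n)
{
    size_t i;
    assert(nodes != NULL && keys != NULL && n > 0);
    for (i = 0; i < n; i++) {
        nodes[i].key = keys[i];
        nodes[i].next = (i + 1 < n) ? &nodes[i + 1] : NULL;
    }
    return &nodes[0];
}

/* Count R: a new section starts wherever a key is smaller than its predecessor. */
size_t count_runs(const IntNode *head)
{
    size_t runs;
    if (head == NULL)
        return 0;
    runs = 1;
    for (; head->next != NULL; head = head->next)
        if (head->next->key < head->key)
            runs++;
    return runs;
}
```

For the keys 1 2 3 2 5 4, count_runs reports three sections (1 2 3, then 2 5, then 4), so under the halving argument above roughly two merge passes would be expected.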


The McDaniel Sort 
Instead of having R lists, my algorithm 
(which I refer to as the “McDaniel Sort”) 
uses just two output lists. As Figure 2 
shows, the first one holds the even sec- 
tions, the second holds the odd sections. 

With this arrangement, I only need four lists: two input lists
(Figure 2) and two output lists (Figure 3). During each pass, I merge
sections from the input lists and alternate the resulting sorted
sections in the output lists. Figure 3 shows how I merge L0 and L1 to
make L01, merge L2 and L3 to make L23, and so on.

The tricky part is knowing where one group (L0 and L1) ends and the
next group (L2 and L3) begins. This can be determined by comparing the
first elements in both input lists to the last element sent (to either
output list). If the first elements in both input lists are less than
the item just output, this indicates the start of a new group.


Detailed Algorithm 

The loop at the bottom of Example 1 is 
the essential part of my algorithm. In each 
iteration of this loop, you can choose the 
first element from either input list and 
place that element at the end of either out- 
put list. Remember that you want to con- 
struct long sorted sequences. If only one 
of the candidates is larger than the last 
item output, then that’s the only one that 
can continue the current sorted sequence. 
If both candidates are larger, then choos- 
ing the smaller of the two will let you con- 
struct a longer sequence. If both candi- 
dates are smaller, then you have to start 
a new sequence, so choosing the smaller 
will let that new sequence be as long as 
possible. 

Of course, since you want to alternate 
sorted sequences between the two output 
lists, whenever you start a new sequence, 
you switch output lists. 
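
The selection rule just described can be sketched in isolation as a small C function over integer keys. The name choose_input and its parameters are assumptions made for this illustration; Listing One expresses the same decisions through the comp callback instead.

```c
#include <assert.h>
#include <stddef.h>

/* Decide which input list (0 or 1) supplies the next item.
 * a, b  : first keys of input lists 0 and 1
 * last  : key most recently sent to an output list
 * new_run is set to 1 when neither candidate can extend the current
 * sorted sequence, meaning the caller should switch output lists. */
int choose_input(int a, int b, int last, int *new_run)
{
    int a_continues = !(a < last);   /* a can extend the current sequence */
    int b_continues = !(b < last);   /* b can extend the current sequence */

    assert(new_run != NULL);
    *new_run = (!a_continues && !b_continues);

    if (a_continues != b_continues)  /* only one candidate can continue */
        return a_continues ? 0 : 1;
    return (a < b) ? 0 : 1;          /* both or neither: take the smaller */
}
```

With last equal to 4, candidates 5 and 3 select list 0 (only 5 can continue), candidates 7 and 6 select list 1 (the smaller of two that can continue), and candidates 2 and 1 select list 1 while flagging a new sequence.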

Listing One presents the complete algorithm. The central loop involves
comparing new candidate elements to the last element output. This
requires that you initialize the first output list with the first
element from one of the input lists. The second output list is
initially set to empty.

Figure 1: Each section, L0, L1, L2, and so on, is already sorted. (a) Original list;
(b) split into separate lists.

Figure 2: Splitting a list by alternating sorted sections.

Figure 3: After the first merge pass.

Example 1: The algorithm.

... to traverse a linked list. But, if you need to modify the list ...
“next” field of the previous item. However, this complicates inserting
or deleting the first item, since you then must update the list head
rather than the next field of some item.


The outermost do/while loop retrieves 
items from the two input lists and sends 
them to the output lists. The code inside 
this loop constitutes one pass. When the 
loop is finished, the second output list is 
checked. If the second output list (out[1])
is empty, then all of the items were routed
to the first output list and the items are
all sorted. If the second output list is not 
empty, then another pass is required. 

The code to select an item from the cor- 
rect input list is fairly complex because 
you have to determine the relative order 
of the last item output and the first item 
in each input list. Whenever you select an 
input item smaller than the last output 
item, you have to switch output lists. 

The algorithm is surprisingly fast. The best case behavior occurs with
a list whose items are already sorted. The sort will be finished after
one pass. The worst case is when the items are in reverse order. For a
list of N items, it will take log₂(N) complete passes to sort the
items, making the order of this algorithm N*log₂(N).


The Sample Program 

I've implemented the code in C for clari- 
ty. This code illustrates several program- 
ming ideas, one of which involves pass- 
ing the address of the comparison routine 
as a parameter. This lets the code sort a 
linked list of virtually any structure, as long 
as the first field of the structure is a point- 
er to the next item. 

To use the Sort function, you must first write a compare routine
(comp) that references any user-defined fields of the structure. This
function contains a “less_than” test that returns True (1) if the item
pointed to by the first parameter is less than the item pointed to by
the second parameter; otherwise, it returns False (0). The parameters
to comp are void pointers because Sort is written before any
particular comp exists, so it cannot know the node type being
compared. The function LT_comp in Listing Two is a good example.
Listing Two is a sample program that sorts a text file.
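
As a further illustration of the compare routine, here is a sketch for a node that carries an integer key. The Employee structure and the employee_less_than function are hypothetical names invented for this example; only the calling convention (two void pointers in, 1 or 0 out, next pointer as the first field) comes from the article.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type: the next pointer must be the first field so a
 * generic list routine can follow the links without knowing the rest. */
typedef struct Employee {
    struct Employee *next;
    int id;
    const char *name;
} Employee;

/* "less_than" test: returns 1 if *a sorts before *b by id, otherwise 0. */
int employee_less_than(void *a, void *b)
{
    const Employee *ea = (const Employee *)a;
    const Employee *eb = (const Employee *)b;

    assert(ea != NULL && eb != NULL);
    return ea->id < eb->id;
}
```

Given a list head of type Employee *, the routine would be passed in the same way Listing Two passes LT_comp, that is, Sort((void **)&head, employee_less_than). Because equal keys return 0 in both directions, duplicates are treated as already in order.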

This sort works well for random data, data that has sorted
subsequences, and data that has a lot of duplicate keys. It can be
used in a variety of different situations, and, unlike quicksort, does
not involve recursion.


DDJ 
(Listings begin on page 128.) 
Listing One

typedef struct GenericNode GenericNode;
struct GenericNode { GenericNode *next; };

void Sort(void **pList, int (*comp)(void *, void *))
{
    int outindex;                 /* current output list (0 or 1) */
    GenericNode *p;               /* Scratch variable */
    GenericNode *in[2], *out[2];  /* Input/Output lists */
    GenericNode **outTail[2];     /* Track last items in output lists */
    GenericNode *lastOut;         /* Last node output */

    if (!*pList) return;          /* Empty list is already sorted */
    out[0] = *pList;              /* point out[0] to the list to be sorted */
    out[1] = 0;
    do {
        in[0] = out[0];           /* Move output lists to input lists */
        in[1] = out[1];
        if (!in[1]) {  /* Only one list? Grab first item from other list */
            p = in[0]; if (p) in[0] = in[0]->next;
        } else {       /* There are two lists, get the smaller item */
            int smallList = comp(in[0], in[1]) ? 0 : 1;
            p = in[smallList]; if (p) in[smallList] = in[smallList]->next;
        }
        /* Initialize out[0] to first item, clear out[1] */
        out[0] = p; outTail[0] = &(p->next); lastOut = out[0];
        p->next = (GenericNode *)0;
        outindex = 0;
        out[1] = (GenericNode *)0; outTail[1] = &(out[1]);
        while (in[0] || in[1]) {  /* while either list is not empty */
            if (!in[1]) {         /* Second list empty, choose first */
                p = in[0]; if (p) in[0] = in[0]->next;
                if (comp(p, lastOut))          /* p < lastOut */
                    outindex = 1 - outindex;   /* switch lists */
            } else if (!in[0]) {  /* First list empty, choose second */
                p = in[1]; in[1] = in[1]->next;
                if (comp(p, lastOut))          /* p < lastOut */
                    outindex = 1 - outindex;   /* switch lists */
            } else if (comp(in[0], lastOut)) { /* in[0] < lastOut */
                if (!comp(in[1], lastOut)) {   /* lastOut <= in[1] */
                    p = in[1]; in[1] = in[1]->next;
                } else {                       /* in[1] < lastOut */
                    if (comp(in[0], in[1])) {  /* in[0] < in[1] */
                        p = in[0]; in[0] = in[0]->next;
                    } else {
                        p = in[1]; in[1] = in[1]->next;
                    }
                    outindex = 1 - outindex;   /* Switch lists */
                }
            } else {                           /* lastOut <= in[0] */
                if (comp(in[1], lastOut)) {    /* in[1] < lastOut */
                    p = in[0]; in[0] = in[0]->next;
                } else {                       /* lastOut <= in[1] */
                    if (comp(in[0], in[1])) {  /* in[0] < in[1] */
                        p = in[0]; in[0] = in[0]->next;
                    } else {
                        p = in[1]; in[1] = in[1]->next;
                    }
                }
            }
            *outTail[outindex] = p;
            outTail[outindex] = &(p->next);
            p->next = (GenericNode *)0;
            lastOut = p;
        }
    } while (out[1]);
    *pList = out[0];
}



Listing Two 


/* llsort.c. Sort 1 or more lines of text.
 * Usage: llsort <infile> <outfile> <optional sort column>
 *        sort column - defaults to 1.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define maxLength 1000 /* maximum length of an input line */
/* If you're sorting on the first column, sortPointer will point to first
 * character of 'info'. Otherwise, it will point further into the string. */
typedef struct MyNode {
    struct MyNode *next;
    char *sortPointer;  /* Pointer to string to be sorted */
    char info[1];       /* String data */
} MyNode;

int LT_comp(void *a, void *b) {  /* True if a<b */
    char *p = ((MyNode *)a)->sortPointer;
    char *q = ((MyNode *)b)->sortPointer;
    return (strcmp(p, q) < 0);
}
int main(int argc, char **argv)
{
    FILE *infile, *outfile;
    MyNode *p, *list, **pTail;
    long int sort_column = 0;
    char st[maxLength], infn[256], outfn[256];

    if (argc < 3) {  /* both file names are required */
        printf("Usage: %s infile outfile [number]\n", argv[0]);
        exit(1);
    }
    /* pick off the file names */
    strcpy(infn, argv[1]);
    strcpy(outfn, argv[2]);
    /* pick off the starting sort column (if it exists) */
    sort_column = 0;
    if (argc == 4) sort_column = atol(argv[3]) - 1;
    /* open the files */
    infile = fopen(infn, "r");
    if (!infile) {
        printf("File %s could not be found.\n", infn);
        exit(1);
    }
    outfile = fopen(outfn, "w");
    if (!outfile) {
        printf("Output file %s could not be opened.\n", outfn);
        exit(1);
    }
    /* initialize the list */
    list = 0; pTail = &list;
    /* read the input file and build the linked list */
    while (fgets(st, maxLength, infile))  /* get one line */
    {
        /* fetch a node that is just the right size */
        p = malloc(sizeof(MyNode) + strlen(st) + 1);
        if (!p) {
            fprintf(stderr, "Out of memory!");
            return 1;
        }
        /* copy the string into the info portion of the node */
        strcpy(p->info, st);
        /* sortPointer points to the part of the string being sorted */
        if ((long)strlen(p->info) < sort_column) {
            p->sortPointer = "";  /* Too short, treat as empty string */
        } else {
            p->sortPointer = p->info + sort_column;
        }
        /* insert the node onto the tail end of the list */
        *pTail = p; pTail = &(p->next);
    }
    *pTail = 0;  /* Terminate list with null */
    fclose(infile);

    printf("Sorting: %s by column %ld\n", infn, sort_column + 1);
    Sort((void **)&list, LT_comp);

    /* Send the sorted data to the output file. */
    p = list;
    while (p) {
        fputs(p->info, outfile);
        p = p->next;
    }
    fclose(outfile);
    return 0;
}



Listing Three 


/* Igor Kolpakov's "reverse pointer" trick */
#include <stdlib.h>

typedef struct _ITEM ITEM;
struct _ITEM {
    int key;
    ITEM * next;
};

ITEM *listHead;

void AddItem(ITEM *newItem) {
    /* Keep a 'reverse pointer' to the pointer to this item */
    ITEM ** rpItem = &listHead;
    ITEM * item = listHead;

    while(item && (newItem->key > item->key)) {
        rpItem = &item->next;
        item = item->next;
    }
    /* Note: No extra tests!! */
    *rpItem = newItem;
    newItem->next = item;
}

void DeleteItem(int key) {
    ITEM ** rpItem = &listHead;
    ITEM * item = listHead;

    while(item && (key > item->key)) {
        rpItem = &item->next;
        item = item->next;
    }
    if(item && (key == item->key)) {
        /* Note: No extra tests!! */
        *rpItem = item->next;
        free( item );
    }
}


DDJ 
Distinguished in his Armani tailored
suit, Rino Banti showed us his badge:
Servizio Speciale, the Italian Secret
Service.

“There is a little known secret about 
art in Italy,” he explained. “All people 
have seen Michelangelo, Da Vinci, Tin- 
toretto, but what you see is a tiny por- 
tion of what was produced. Looters from 
all over Europe have taken the patrimo- 
ny of Italy for their own enjoyment. Not 
to mention local thieves. Sometimes the 
looting is destructive, as in the case of 
the splendid Arabic-Venetian marble rect- 
angle that is the subject of my visit. 

“The work of art we’re seeking be- 
longed to one of the great Doges of 
Venice, Enrico Dandolo, the blind con- 
queror of Constantinople in 1204. (His 
role in the conquest may have been ex- 
aggerated. The fighters of the fourth cru- 
sade conquered the city and turned it over 
to Venice in payment for the warships 
Venice had provided.) The artwork is a 
geometric design in a rare turquoise mar- 
ble. Dandolo couldn’t see the turquoise, 
but it is said that he relished the touch of 
the lines carved in the marble. 


Dennis, a professor of computer science at 
New York University, is the author of The 
Puzzling Adventures of Dr. Ecco (Dover, 
1998), Codes, Puzzles, and Conspiracy 
(WH. Freeman G Co., 1992), Database Tun- 
ing: A Principled Approach (Prentice Hall, 
1992), and (coauthored with Cathy Lazere) 
Out of Their Minds: The Lives and Discov- 
eries of 15 Great Computer Scientists 
(Springer Verlag, 1998). He can be con- 
tacted at DrEcco@ddj.com. 
DR. ECCO'S OMNIHEURIST CORNER


“This original square measures 16x16, 
and consists of smaller rectangles and 
triangles. These smaller pieces have been 
squirreled away in private collections all 
over the continent and some even in the 
United States. Most of the people who 
have them deny any knowledge of 
wrong-doing and we cannot, franca- 
mente, accuse them of lying. The theft 
occurred long before the pieces in ques- 
tion fell into the hands of their current 
owners. 

“What’s worse is that the owners 
protest molto about the prospect of giv- 
ing up their works of art. Sometimes 
they have paid dearly for them. We 
think, however, that we can convince 
the courts that the correct pieces come 
from the stolen rectangle provided we 
can prove that those pieces reconstruct 
the rectangle and no combination of the 
other turquoise pieces that we have lo- 
cated can possibly reconstruct the rect- 
angle. 

“The trouble is that we don’t know 
which are the correct pieces. At least 
some of the pieces we have located 
must belong to lesser works of art, be- 
cause our mathematicians tell us they 
cannot form a rectangle of the correct 
size. Our problem is to figure out which 
pieces can possibly make up the rect- 
angle and how to form the rectangle 
from them. Then we must show that no 
other combination of pieces can fit the 
rectangle.” 

“Geometry!” Liane exclaimed. “I’ve been 
longing for a geometrical puzzle.” 

“So have I,” Ecco said, smiling. “Signore Banti, please tell us the
sizes of the pieces available as well as the size of the desired
rectangle.”

“Grazie molto, Dr. Ecco,” Banti replied. “They told me you help only
when you find the problem engaging. Please see this figure (see
Figure 1) for the sizes.”


Reader: Find a subset of these shapes that form the rectangle. Show
how to form the rectangle (please use JPEG and send me the web page if
you submit a solution). Also, show that no other combination of shapes
can possibly form the rectangle.


Last Month’s Solution 

The first problem is to find a mapping 
from dried fruit to page numbers. Here is 
what Ecco found: 


coconut 1 
prune 2 
date 3 
grapefruit 4 
raisin 5 

fig 6 
currant 7 
pineapple 8 
apricot 9 


There were two reversed edges as Simon 
had feared. They are: 


• apricot pineapple, which should have been pineapple apricot;

• raisin coconut, which should have been coconut raisin.


When we lay out the text in order of 
page number and concatenate, we will get: 
Figure 1: The layout problem: You are given a 16×16 square. You want to fill it
with rectangles and triangles. Available rectangles: 2×2, 4×3, 7×5, 16×3,
√(8)×√(32), √(18)×√(8), √(288)×√(32). All triangles are right isosceles triangles.
For every rectangle with an irrational side, there is a triangle with a hypotenuse
that matches that side. For every integral side of a rectangle, there is a triangle
with a nonhypotenuse side that matches it. So, there may be multiple
triangles of 2×2×√(8), 4×4×√(32), 3×3×√(18), 7×7×√(98), 5×5×√(50),
16×16×√(512), and 12×12×√(288).


“onlm4beurmxSmram8nb8vmhwwdddm 
3r8a7m9rgm9xu6gllm7rmar6mvxun 
6mxomé6rrmlbevp” 


The corresponding cleartext is: “the car- 
go is on uhaul vxx222 bound for figtree 
do not light it too early.” 

With the information I’ve given you, it 
is impossible to figure out that 222 is the 
number. I will report in a future column 
how close some readers were able to get 
to this decoding. 


Solutions to the Sultan's Problem 
Several readers matched or improved upon 
Liane’s solution to the Trains for the Sultan 
problem (DDJ, March 1999), including Dr. 
Burghart Hoffichter, James Waldby, Chris 
Rosenbury, and D.F. Curran. 

Several other solutions improved the 
scheduling by causing trains to wait at cer- 
tain stations, thus leaving the tracks clearer. 

James Waldby found a solution to the first problem (50 people trains)
that reduces the time from 105 minutes (Liane’s solution) to 90+3ε,
where ε is the time to wait. For convenience, let’s make ε represent
one minute. Typical trains do two minute stops.


Route 
Taken 


AEF, wait one minute, FRBCBG 
BAEFBAE 

CBD DBC€ECBD 

DBCBD, wait two minutes, DBC 
EFBAEFB 

FBAEFBA 

GBAE, wait one minute, EFBD 


Start 
Time 


NH O:Oo © © © 
Chris Rosenbury proposed the following when five trains can carry 150
passengers. The goal is to reduce the size of station B to two
platforms by staggering train departures.


Start Time   Route Taken
0            EFBD (150 people)
0            AE, wait 15 minutes, then EF (150 people)
0            FBAE (150 people)
0            GBA (50 people)
5            CBG (50 people)
10           CBD (150 people)
10           DBC (150 people)
15           FBC (50 people)


D.F. Curran (DFCurran@aol.com) proposed
a different layout with six tracks
instead of seven and two extra platforms
in station C. He further proposed switches
in the middle of a track to allow trains
traveling in opposite directions to
pass one another in the middle of a
track. (In Zurich, two-way traffic on one
track is accomplished by building a short
stretch of two tracks in the middle.
Trams moving in opposite directions
are timed so they pass each other at exactly
this section of track.) He was then
able to show that 60 minutes was
enough.


DDJ 
PROGRAMMER’S BOOKSHELF


Greatest Hits of the ‘70s


Gregory V. Wilson 


As the saying goes, there’s good news
and bad news. The good news is
that if I switch on the radio, I have
a reasonable chance of hearing the
Doobie Brothers singing “Black Water.”
The bad news is that kids are wearing
flares and platform shoes again.

Luckily, there’s more good news as
well: Brian Kernighan and Rob Pike have
a new book out. Their initial joint effort,
1984's The Unix Programming Environment,
was the first comprehensive introduction to
the standard UNIX toolset, and in many
ways remains the best. It, and three other
books with Kernighan’s name on the
spine (The C Programming Language, Software
Tools, and The Elements of Programming
Style), had a lot to do with UNIX becoming
the world’s most popular operating
system. (Yes, I know Windows is more
widely used: I said “popular” on purpose.)

The Practice of Programming recapitu- 
lates and updates the best parts of those 
four books. Coding style, interface design, 
testing and debugging techniques, and 
ways of improving program performance 
are discussed lucidly and authoritatively. 
As a bonus, some of the examples are implemented
in two or more of C, C++, Java,
Awk, and Perl, so that Kernighan and Pike
can compare and contrast those languages’
strengths and weaknesses.

Upon reflection, however, two things
about this book left me feeling slightly depressed.
The first is how little software development
practices have changed in 20
years. Most programs are still written without
ever having really been designed, and
tested haphazardly if at all. In a lot of
ways, we have changed less since the
1970s than the big car manufacturers or
grocery chains. The second thing that depressed
me is that I'll probably be able to
say the first thing again 20 years from now,
when the Doobie Brothers are back on
the radio for the third time...

Greg is the author of Practical Parallel Programming
(MIT Press, 1995), and coeditor
with Paul Lu of Parallel Programming
Using C++ (MIT Press, 1996). Greg can be
reached at gvwilson@interlog.com.

Donald J. Becker, and
Daniel F. Savarese
MIT Press, 1999

The way we program might not change
in the next two decades, but the hardware
our programs run on is bound to. How to
Build a Beowulf, by Thomas L. Sterling,
John Salmon, Donald J. Becker, and Daniel
F. Savarese, provides a glimpse of what
that hardware might be like. A “Beowulf”
is a supercomputer constructed from mass-produced
PC components, and running
freely available software. The term comes
from NASA’s Beowulf project, which built
the first such machine in 1994. Today, for
a couple of hundred thousand dollars, you
can build a machine that has 200 state-of-the-art
microprocessors, several gigabytes
of RAM, and a fast interconnection network
based on any of several switching
technologies. Such a machine is not only
more powerful than anything that existed
in the world a decade ago, it is probably
also more reliable, since its hardware and
software are both mass-market products.

How to Build a Beowulf discusses ev- 
erything from the choice of processors 
for such a machine (x86 compatible is 
the favorite choice, with the DEC Alpha 
a close second), through installing and 
configuring Linux, to network protocols, 
security, and application programming. 
Mass-produced components like PC moth- 
erboards and memory chips are part of 
what makes Beowulf-class machines pos- 
sible, but free software like Linux, the 
GNU compilers, and the Message-Passing 
Interface (MPI) are just as important. As 
web server stats show, Linux is already 
more stable than many commercial oper- 
ating systems; that stability is crucial if you 
are trying to keep 200 or more instances 
of the OS up and running simultaneous- 
ly. Similarly, the openness of MPI has freed 
programmers from dependence on the 
proprietary (and usually short-lived)
programming systems foisted on them by
vendors in the bad old days of the 1980s 
and early 1990s. 

Programming a massively parallel ma- 
chine is still no easy task, as the failure of 
the parallel computing start-ups of the last 
two decades shows. Now that machines of 
this kind are within the reach of medium- 
sized companies and academic depart- 
ments, however, and given the speed with 
which enterprise applications like Oracle 
are being ported to Linux, I expect that 
parallel computing is finally going to go 
through the “phase change” that hit desk- 
top computing in the early 1980s. Of 
course, I said the same thing in 1989... 

This month’s third book, Developing
Visual Basic Add-ins, by Steven Roman,
is narrower than How to Build a Beowulf,
but will probably be of more immediate 
use to a lot of programmers as a result. All 
major Windows tools expose large parts of 
their functionality through a COM interface. 
This allows developers to call on such things 
as Microsoft Word’s spelling checker, or Ex- 
cel’s calculation engine, from applications 
written in C++, Visual Basic, or even Perl. 
It also allows you to add functionality to 
those tools, and in particular to extend 
Microsoft Developer Studio by adding your 
own buttons, toolbars, and windows to it. 







Microsoft’s own descriptions of how to 
do this are contradictory and incomplete, 
but that’s where Roman’s book comes in. 
After a short introduction, Roman dives 
into the specifics: what a basic add-in has 
to provide, how it can register itself, how 
to add menus, how to handle events, and 
so on. Marginal flags show which pieces 
of information are VB5 or VB6 specific,
and this information on its own almost 
justifies the cost of the book. Finally, at 
171 pages (not counting a few pages of 
advertising at the back), the book has the 
almost unique property among Visual Ba- 
sic books of being small enough to hold 
comfortably in one hand... 

The final book this month is Graph
Drawing: Algorithms for the Visualization
of Graphs, by Giuseppe Di Battista,
Peter Eades, Roberto Tamassia, and Ioannis
G. Tollis. The title is an accurate summary
of the book’s contents, but doesn’t
do justice to its breadth. Section 5.1, for 
example, is devoted to angles in orthog- 
onal drawings, while Chapter 7 covers 
incremental construction techniques. The 
style is academic — there are a lot of ref- 
erences, and a lot of proofs and lem- 
mas— but the book will be a rich mine 
of ideas for anyone who is trying to per- 
suade a computer to turn data into dots, 
boxes, lines, and arrows. 


DDJ 
The Free Software Foundation (FSF) has released
GNOME (GNU Network Object Model
Environment) 1.0, an integrated desktop
environment designed to run on GNU/Linux
systems. GNOME includes a drag-and-drop-enabled
desktop that uses the standard
Xdnd and Motif protocols, the ability to assign
an icon to a file or URL, internationalization
support, and support for scripting
and compiled languages, including Ada, C,
C++, Objective-C, TOM, Perl, and Guile.
GNOME is available for free download at
http://www.gnu.org/, http://www.gnome.org/,
and several other mirror sites.

Free Software Foundation 

59 Temple Place, Suite 330 

Boston, MA 02111 

617-542-5942 

http://www.gnu.org/ 


Instantiations has announced the formation
of the Java Performance Lab (JPL).
The JPL system combines the company’s
Java performance technologies, JOVE
and the Flash Compiler, with technical
consulting and support. The JPL system
gives customers a performance-tuned version
of their Java program, JPL optimization
and compilation tools, JPL consulting and
lab time, and ongoing performance and
deployment technical support.
Instantiations Inc. 

7618 S.W. Mohawk Street 

Tualatin, OR 97062 

503-612-9337 


http://www.instantiations.com/


General Software has released Embedded
BIOS Version 4.2, a configurable BIOS for
embedded systems. Over 400 configuration
options can be selected using BIOStart,
a rule-based expert system that makes
BIOS adaptation clear and straightforward.
Embedded BIOS 4.2 addresses the whole
lifecycle of the embedded process, including
board bring-up, configuration prototyping,
testing with system diagnostics,
manufacturing mode, and in-field diagnostics
and software reload. Version 4.2 extends
firmware support for the latest reference
designs from several silicon manufacturers,
including Intel’s Low-Power Embedded
Pentium Processor with MMX Technology
Development Platform (Pentium and
Pentium II); AMD’s K6, Elan SC400, Am486
Microprocessor Customer Development Platform,
and Elan SC300; Cyrix’s MediaGXm
Processor development platforms; ST Microelectronics’s
STPC Industrial and STPC
Consumer Embedded Microprocessors; and
others. Support for PCI bridging, AGP,
SDRAM, SPC, and a copy of the Caldera System
Builder Kit are also included. Additionally,
the Embedded BIOS 4.2 Adaptation
Kit now supports builds up to 256 KB.
General Software Inc. 

12737 Bel-Red Road, Suite 100 

Bellevue, WA 98005 

425-454-5755 


http://www.gensw.com/ 


Cygnus Solutions has announced an open 
source Java Programming Language test 
suite intended to substantiate Java API com- 
patibility. The Java test suite was developed 
according to published Java API docu- 
mentation and is the product of the Open 
Source Mauve project, a collaborative ef- 
fort initiated by Cygnus between clean- 
room Java technology developers who do 
not have access to Sun Microsystems’s Java
Compatibility Kit. This test suite allows or- 
ganizations developing clean-room Java li- 
braries (an essential component of the Java 
language specification) to test compatibility 
against Sun’s published Java standards with- 
out compromising their clean-room imple- 
mentations. The test suite is freely available 
at http://sourceware.cygnus.com/mauve/. 
Cygnus Solutions 

1325 Chesapeake Terrace 

Sunnyvale, CA 94089 

408-542-9600 

http://www.cygnus.com/ 


The Component Vendors Consortium (CVC) 
is a nonprofit organization whose purpose 
is to advance and promote the use of third- 
party software components and tools by 
developers, and to enhance the public’s un- 
derstanding of the reliability of third-party 
tools and the strategic advantage inherent 
in using third-party components and tools. 
A key objective of the CVC is to develop 
an objective testing, measuring, and brand- 
ing process to identify vendors and com- 
ponents that meet rigorous quality and sup- 
port standards. The membership of the CVC 
consists of companies that develop and mar- 
ket commercial software components 
and/or development tool products. Found- 
ing CVC members include Advantageware, 
APEX Software, Artisoft, Bennet-Tec, Com- 
puWare, Dart Communications, Data Dy- 
namics, Data Techniques, Desaware, DBI 
Technologies, Distinct, FarPoint Technologies,
LEAD Technologies, Modern Software,
ProtoView Development, Sax Software, Sea- 
gate Software, Sheridan Software, VideoSoft, 
and Wise Solutions. Software vendors or de- 
velopers who are interested in membership 
in the CVC should see CVC’s web site for 
more information. 

Component Vendors Consortium 
http://www.components.org/ 


DOME 1.0 from Experimental Object Tech- 
nologies is a set of visual tools for devel- 
opers of distributed COM applications. 
DOME 1.0 supports graphical modeling, 
code generation, simulation, deployment, 
monitoring and management, and inte- 
grates with Microsoft Visual C++. DOME 
enables COM application developers to 
rapidly create and launch application pro- 
totypes, debug timing and synchroniza- 
tion and estimate performance of a dis- 
tributed application via simulation, deploy 
the application components over the tar- 
get network, monitor the distributed ap- 
plication, and manage the distributed ap- 
plication remotely via COM interfaces. 
Experimental Object Technologies 
http://www.xjtek.com/products/dome/ 


Perforce Software is shipping Release 99.1 
of the Perforce Software Configuration Man- 
agement System, a comprehensive software 
configuration management (SCM) system 
that helps software development teams ef- 
ficiently manage large, complex develop- 
ment projects across multiple operating sys- 
tem platforms. This version improves 
performance with a new data compression 
capability in the Perforce API that enables 
the GUI to run two to four times faster than 
previously over slow network connections. 
In addition, a new client-side compression 
option speeds data transfer by as much as 
10 times over the basic file transfer proto- 
col, especially over wide-area networks. 
Other new features include the ability of 
the administrator to limit the size of oper- 
ations that particular users can undertake 
and the introduction of presubmit triggers. 
A single-user license costs $600.00. 
Perforce Software Inc. 

2420 Santa Clara Avenue, Suite 200 
Alameda, CA 94501 

510-864-7400 

http://www.perforce.com/ 


The RPK Encryptonite Software Toolkit 3.1
from RPK Security claims increased engine
performance over previous versions and a
new Indexed Encryption for high-data-flow
applications like streaming multimedia. The
RPK Encryptonite Toolkit offers performance
increases on initialization and key
creation and accommodates out-of-order,
missing, or corrupted/garbled data without
noticeable performance degradation or
reduction in security. Version 3.1 also
adds sources of randomness
for use in a variety of applications, including
server-only situations where user
input is not normally available. The RPK
Toolkit includes ANSI standard C/C++ libraries
for Windows 95/98/NT, HP/UX,
Sun Solaris (C only), and Linux; a Delphi
3.0/4.0 VCL component for Windows
95/NT; a DLL; and ActiveX. It also has been
compiled and tested with Visual C++, Borland
C++ Builder, and GNU g++. Because
the RPK Toolkit is developed outside the
U.S., it is available worldwide with strong
encryption. Pricing starts at $695.00 per
developer; deployment license fees are
based upon custom configurations.

RPK Security Inc. 

1755 Filbert Street, Suite 1U 

San Francisco, CA 94123 

212-488-9891 

http://www.rpk.com/ 


ILOG has introduced ILOG OPL Studio, a 
design environment that allows develop- 
ment and deployment of supply-chain op- 
timization applications. ILOG OPL Studio 
combines several optimization methods, 
letting you identify the best approach for 
your application. Features include an on- 
line model library, database connectivity 
tools, debugging tools, and an automatic 
code generator, all within a graphical en- 
vironment. The Optimization Programming 
Language (OPL) provides advanced data 
representation features suited for opti- 
mization applications. The OPL language 
lets you represent and solve optimization 
problems using linear and integer pro- 
gramming, constraint programming, and 
scheduling techniques. ILOG OPL Studio 
is available for Windows 95/98/NT and 
Sun SPARC/Solaris platforms. It is offered 
as an add-on to the ILOG Optimization 
Suite component libraries. There is also a 
version called OPL Studio Pro that includes 
integrated versions of ILOG’s optimizers. 
ILOG 

1080 Linda Vista Avenue 

Mountain View, CA 94043 

650-567-8000 

http://www.ilog.com/


ASSET has announced the ASSET Internet 
Application Server (IAS) 1.0, an Internet 
application server initially targeted for 
Smalltalk programmers. IAS includes an 
object-oriented framework (WEBArtifacts) 
and scripting language (WEBScript, simi- 
lar to Smalltalk) for developing web-based 
applications. IAS combines a web server 
and application server in a single execu- 
tion environment. Multiple instances of 
IAS may be clustered to achieve higher 
performance and availability. ASSET provides
a separate bridge to make IAS accessible
via IS organizations’ existing web
infrastructures. IAS for Windows NT lists
for $2000.00. WEBArtifacts is available as 
a service from ASSET and its price de- 
pends on the size and scope of a project. 
ASSET Inc. 

P.O. Box 1236 

New York, NY 10156 

212-689-6489 


http://www.assetinc.com/ 


TranDyne, a division of Transitive Dy- 
namics, announced Dr. Parse, a set of util- 
ities and tools for parser development. Dr. 
Parse deploys DFA through LR(1) language- 
independent parse generators and can sup- 
port multiple DFA through LR parse engines 
in a single application. Dr. Parse converts 
grammars into a parse data stream, detect- 
ing syntactical errors, and generating data 
into the developer’s choice of languages. 
The Parser Foundation Classes (PFC) as- 
semble parsing components, allowing ma- 
nipulation of data. Dr. Parse VIP (Visually 
Interactive Parser) lets you view a variety of 
parsing modes and detect invalid sentences 
and incorrect ASTs. Its IDE consists of a syn- 
tax highlighter and debugger. The IDE lets 
you input a grammar and check for errors 
in syntax. Once the grammar is deemed syn- 
tactically correct, the parser’s conflict reso- 
lutions can be viewed and modified. This 
static check can then be followed by a se- 
ries of dynamic test scenarios using the VIP. 
Transitive Dynamics Inc. 

14150 NE 20th Street, Suite 373 
Bellevue, WA 98007 

425-519-3640 


http://www.trandyne.com/ 


The 8051 Virtual Workshop from Cross- 
ware Products lets 8051 programmers sim- 
ulate their complete target system. The 
8051 Virtual Workshop allows the software 
debugging and verification process to pro- 
ceed in the absence of any target hard- 
ware. The 8051 Virtual Workshop runs un- 
der Windows 95/98/NT. At the heart of 
the system is an 8051 Instruction Set Sim- 
ulator with full source-level and graphi- 
cal debugging facilities. 

Crossware Products 

Old Post House, Silver Street 

Litlington, Royston, Herts 

United Kingdom SG8 0QE 

44 1763 853500 


http://www.crossware.com/ 


Sheridan Software has introduced Code- 
Assist, a new code-generation software 
package. Using templates, CodeAssist 
helps Visual Basic developers create data 
access and other routines in Visual Basic, 
HTML, and SQL. A collection of more than
100 prebuilt templates are provided for 
common code requirements for calling
data objects from two-tier and multitier
applications, as well as interactive browsers 
for accessing and manipulating databas- 
es, data objects, and templates. Users can 
modify or customize the prebuilt templates 
to meet their exact requirements or cre- 
ate and save their own templates as need- 
ed. This first release works with Microsoft 
Access and SQL Server. CodeAssist costs 
$295.00. 

Sheridan Software Systems Inc. 

35 Pinelawn Road, Suite 206E 

Melville, NY 11747 

516-753-0985 


http://www.shersoft.com/ 


Monotype Typography has announced the 
New Media Core Fonts set, a new collec- 
tion of fonts designed for high legibility 
on low-resolution devices such as com- 
puter screens, consumer electronic de- 
vices, and television screens. The New 
Media Core Fonts are part of Monotype’s 
Enhanced Screen Quality (ESQ) line of 
TrueType fonts that have been optimized 
for viewing in environments such as in- 
teractive TVs, palmtop computers, and in- 
formation appliances. They were specifi- 
cally developed by Monotype designers 
and engineers to optimize legibility on de- 
vices that have a limited number of pix- 
els with which to display fonts. Albany 
ESQ, Thorndale ESQ, and Cumberland 
ESQ font families each contain the regu- 
lar, italic, bold, and bold italic styles. They 
are metrically compatible with the origi- 
nal Windows core set so documents cre- 
ated with either core set of fonts will not 
reflow. Monotype can supply the New Me- 
dia Core Font set in a Unicode-compliant 
format with a variety of script support for 
its customers.

Monotype Typography Inc. 

985 Busse Road 

Elk Grove Village, IL 60007 
847-718-0400 
http://www.monotype.com/ 


Nettech Systems has introduced the Net- 
tech Developer Suite, a set of SDKs for 
wireless-enabling applications. The Net- 
tech Developer Suite bundles the Nettech 
RFexpress, Smart IP, and InstantRF SDKs
for all the networks and operating systems 
supported by each product into one com- 
prehensive package. The Nettech Devel- 
oper Suite is sold as an annual subscrip- 
tion of $995.00 per year/per developer. 
Nettech Systems 

600 Alexander Road 

Princeton, NJ 08540 

609-734-0300 
http://www.nettechrf.com/ 

Corrections 

An article entitled “The Ultimate Y2K Filter” in our April issue may have caused some 
confusion among readers. It somehow slipped past the usually keen perceptions of our crack 
team of editors that the author of the article was borderline delusional. This misguided 
wretch was apparently under the mistaken impression that the entire computer field was afflicted 
with something called the “Y to K” problem, rather than the “Y2K” problem. It was this former 
“problem” that he attempted to solve, and, to be fair, he did a pretty good job of it. His proposed 
solution was a program that replaces “Y” with “K” in text documents. Just to be on the safe side, 
he also had it replace all original “K”s with “Y”s as well. While this should only have made the 
article worthless, one as-yet-unidentified member of the editorial staff apparently thought that it 
would be a good idea to pass the article itself through the Y2K, or rather YtoK, filter, rendering 
the copy not only worthless but also somewhat below our usual standards of readability. We 
apologize for any inconvenience that this may have caused. 

Due to a copyediting error, an article on the Linux operating system in that same issue referred 
to Linus Torvalds as “that Swedish meatball” and “the illegitimate son of Burgess Meredith.” Mr. 
Torvalds is, of course, from Finland. Furthermore, if the reader will mentally replace “gnome” 
with “GNOME,” “demon” with “daemon,” and “eunuchs” with “Unix” throughout the article, it 
will read more like an essay on operating-systems technology and less like a selection from 
Grimm's Fairy Tales. 

A news item in that issue correctly reported that Hewlett-Packard was splitting into two 
companies. However, due to that dreamy state of mind that editors get into late Friday afternoon, 
it went on to say that the two companies that Hewlett-Packard would split into were Burger King 
and Chrysler. This is, we find on checking our sources, not precisely the case. We bitterly regret 
the error. 

It is not true that the Australian government’s decision to require Internet Service Providers to 
remove Australian sites that have sexually explicit material would “undermine the country’s chief 
export,” as another news item in the issue reported. This error was the result of fanatical 
anti-Australian prejudice on the part of a junior-level editor, who has since been promoted to an 
executive position where she can do no further harm. 

Several errors also crept into that issue’s review of Bill Gates’ latest bestseller, Business at the 
Speed of Thought. The book is a guide for companies wanting to get into e-commerce, making 
the reviewer’s characterization of it as “a soft-core romance novel with characters borrowed from 
Melrose Place” clearly inappropriate. Waggener Edstrom, Microsoft’s and Mr. Gates’ public 
relations firm, furthermore denies the reviewer’s assertion that “a serial number has been 
embedded in the book to monitor readers’ every movement.” We contacted the reviewer during 
one of his rare sober moments and concluded that he was referring to the barcode that set off an 
alarm when he attempted to leave the bookstore without having paid for the book. 

In addition, we now understand that the kayak on which Hewlett-Packard is supporting Linux 
is a Kayak workstation, which pretty much invalidates everything contributing editor Al Stevens 
had to say in his column for that issue about the “seaworthiness” of Linux. Michael Swaine’s 
“Swaine’s Flames” column for that issue was nothing more than a thinly veiled attack on all that is 
good and noble in this world, for which we apologize. And Jonathan Erickson’s editorial on 
installing drywall appears, now that we actually read it, to have been intended for another 
magazine altogether. 







Michael Swaine 
editor-at-large 
mswaine@swaine.com 